The problem of adaptive multivariate function estimation under the
single-index assumption is studied in the framework of the regression
model with random design. We consider the case when both the link
function and the index vector are unknown. We propose a novel estimation
procedure that adapts simultaneously to the unknown index vector and the
smoothness of the link function by selecting from a family of specific
kernel estimators. We establish a pointwise oracle inequality which, in
its turn, is used to judge the quality of estimating the entire function
(global oracle inequality). Both results are applied to the problems of
pointwise and global adaptive estimation over a collection of Hölder and
Nikol'skii functional classes.



1. Introduction. This paper deals with multivariate function estimation.
We establish local as well as global oracle inequalities and show how
to use them for deriving minimax adaptive results.

Model and set-up. We observe data
$(X_1, Y_1), \ldots, (X_n, Y_n) \in \mathbb{R}^d \times \mathbb{R}$,

(1.1) $Y_i = F(X_i) + \varepsilon_i, \quad i = 1, \ldots, n,$

where $d \ge 2$, the noise variables $\{\varepsilon_i\}_{i=1}^n$ are i.i.d.
centered random variables satisfying the moment conditions given in
Assumption 1 below, and the design points $\{X_i\}_{i=1}^n$ are independent
random vectors with common density $g$ with respect to the Lebesgue
measure. The sequences $\{\varepsilon_i\}_{i=1}^n$ and $\{X_i\}_{i=1}^n$ are
assumed to be independent, and the density $g$ is supposed to be known.

Additionally, we assume that the function $F : \mathbb{R}^d \to \mathbb{R}$
possesses a single-index structure, that is, there exist
$f : \mathbb{R} \to \mathbb{R}$ and $\theta^* \in \mathbb{R}^d$ such that

(1.2) $F(x) = f(x^T \theta^*).$
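
To make model (1.1)-(1.2) concrete, the following minimal Python sketch simulates observations for $d = 2$. It is an illustration only: the uniform design on $[-5/2, 5/2]^2$, the Gaussian noise, the link `math.tanh` and all function names are assumptions of the sketch, not part of the paper.

```python
import math
import random


def simulate_single_index(n, theta, link, noise_sd=0.1, half_width=2.5, seed=0):
    """Draw (X_i, Y_i), i = 1..n, from Y_i = link(X_i^T theta) + eps_i.

    Assumptions of this sketch: the design is uniform on
    [-half_width, half_width]^2 and the noise is centered Gaussian.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (rng.uniform(-half_width, half_width),
             rng.uniform(-half_width, half_width))
        index = x[0] * theta[0] + x[1] * theta[1]  # x^T theta
        y = link(index) + rng.gauss(0.0, noise_sd)
        data.append((x, y))
    return data


def design_density(half_width=2.5):
    """Constant value of the uniform design density on the square."""
    return 1.0 / (2.0 * half_width) ** 2


# Hypothetical index vector on the unit circle and a smooth bounded link.
theta_star = (math.cos(0.3), math.sin(0.3))
sample = simulate_single_index(500, theta_star, math.tanh)
```

With these choices the design density is known and constant on the square, which matches the requirement that $g$ be known.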

A minimal technical assumption imposed on the link function is that $f$
belongs to some Hölder ball. We would like to emphasize that knowledge
of this ball is not required; in particular, this information is not used
in the proposed estimation procedure (for more detail, see the discussion
after Assumption 3). All the results established in the paper, except the
lower bound proved in Theorem 4, are obtained for $d = 2$. The principal
difficulties in extending these results to the case of an arbitrary
dimension are discussed in Remark 2. We note also that the single-index
assumption, even if $d = 2$, is a direct generalization of the univariate
regression model. As a consequence, our results, mainly presented in
Section 2.2, generalize in several directions existing ones obtained in
the univariate regression model with random design; see the discussion
after Theorem 5 and the references therein.

Thus, the paper aims at estimating the entire function $F$ on
$[-1/2, 1/2]^2$ or its value $F(x)$ from the data $\{(X_i, Y_i)\}_{i=1}^n$
without any prior knowledge of the nuisance parameters $f(\cdot)$ and
$\theta^*$. The unit square is chosen for notational convenience, and all
the results remain true when $[-1/2, 1/2]^2$ is replaced by an arbitrary
bounded interval of $\mathbb{R}^2$.

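Throughout the paper we adopt the following notation. The joint
distribution of the sequence $\{(X_i, Y_i)\}_{i=1}^n$ will be denoted by
$\mathbb{P}_F^{(n)}$, that of $\{(X_i, \varepsilon_i)\}_{i=1}^n$ by
$\mathbb{P}^{(n)}$. Moreover, $\mathbb{P}_X^{(n)}$ and
$\mathbb{P}_\varepsilon^{(n)}$ will be used for the marginal distributions
of $\{X_i\}_{i=1}^n$ and $\{\varepsilon_i\}_{i=1}^n$ respectively.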

To judge the quality of estimation we use either the risk determined by
the $L_r$ norm, $\|\cdot\|_r$, on $[-1/2, 1/2]^2$ with $r \in [1, \infty)$
("global" risk)

(1.3) $\mathcal{R}_r^{(n)}(\hat F, F) = \mathbb{E}_F^{(n)} \|\hat F(\cdot) - F(\cdot)\|_r,$

or the "pointwise" risk

(1.4) $\mathcal{R}_{r,x}^{(n)}(\hat F, F) = \big(\mathbb{E}_F^{(n)} |\hat F(x) - F(x)|^r\big)^{1/r}, \quad x \in [-1/2, 1/2]^2.$

Here $\hat F(\cdot)$ is an estimator, i.e. an $\{(X_i, Y_i)\}_{i=1}^n$-measurable
function, and $\mathbb{E}_F^{(n)}$ denotes the mathematical expectation with
respect to $\mathbb{P}_F^{(n)}$.

Main assumptions. Let us formulate the principal assumptions used in the
sequel. They are imposed on the distributions of the design and noise
variables as well as on the approximation property of the link function.

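Assumption 1. The random variable $\varepsilon_1$ has a symmetric
distribution with density $p$ with respect to the Lebesgue measure.
Moreover, there exist $\Upsilon > 0$, $\Omega \in (0, 1]$ and $\omega > 0$
such that

$$p \in \mathfrak{P} := \Big\{ \ell : \mathbb{R} \to \mathbb{R}_+ : \int_x^\infty \ell(y)\,dy \le \Upsilon e^{-\Omega x^{\omega}}, \ \forall x \ge 0 \Big\}.$$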
The assumption holds, for example, for the Gaussian, Laplace or, more
generally, for symmetrized Weibull distributions. Throughout the paper the
functional class $\mathfrak{P}$ is supposed to be fixed.

Assumption 2. There exists $\underline{g} \in (0, 1]$ such that
$\inf_{x \in [-5/2, 5/2]^2} g(x) \ge \underline{g}$.

The assumption holds obviously if the design points are uniformly
distributed on any bounded Borel set containing $[-5/2, 5/2]^2$.

Note that the imposed condition is "fitted" to estimation over
$[-1/2, 1/2]^2$, which motivates the appearance of the set $[-5/2, 5/2]^2$.
In the general case, when the estimation problem is considered on some
rectangle $[a, b] \times [c, e] \subset \mathbb{R}^2$, the corresponding
infimum should be taken over the rectangle
$[a - 2, b + 2] \times [c - 2, e + 2]$. We remark that, independently of the
values $a, b, c, e$, the assumption is fulfilled if
$g \in C(\mathbb{R}^2)$ and $g(x) > 0$ for any $x \in \mathbb{R}^2$.

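Assumption 3. There exist $\beta_0 \in (0, 1)$ and $M > 0$ such that

$$f \in \mathbb{F}(\beta_0, M) := \Big\{ \ell : \mathbb{R} \to \mathbb{R} : \|\ell\|_\infty + \sup_{y_1 \ne y_2} \frac{|\ell(y_1) - \ell(y_2)|}{|y_1 - y_2|^{\beta_0}} \le M \Big\}.$$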

The latter assumption guarantees that the link function is smooth.
However, it is important to emphasize that the parameters $\beta_0$ and $M$
are not supposed to be known a priori. In particular, they are not
involved in the estimation procedure developed in the paper. On the other
hand, both parameters restrict the minimal sample size needed to justify
all the theoretical results. Set for any $n \in \mathbb{N}^*$



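(1.5) $h_{\min} = n^{-1} \ln^{1 + 2/\omega}(n), \qquad \mathfrak{h} = \sqrt{n^{-1} \ln^{1 + 2/\omega}(n)}.$

In the sequel it will be assumed that $n \ge n_0$, where

(1.6) $n_0 = \inf\big\{ m \in \mathbb{N}^* : (M \vee 1) \max\big( \mathfrak{h}^{\beta_0}, \ln^{1/\omega}(n)\, h_{\min}^{\beta_0} \big) \le 1, \ \forall n \ge m \big\}.$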

We finish this discussion with the following remark. All the results
obtained in the paper remain true if one assumes that
$f \in \mathbb{F}(0, M)$ (the uniform boundedness of the link function) and
$M$ is known.

Objectives. For clarity of presentation, we will assume that the index
vector $\theta^* \in \mathbb{S}^1$, where $\mathbb{S}^1$ is the unit sphere
in $\mathbb{R}^2$. However, in Section 2.1.4 it is shown that our results
can be extended to the case $\theta^* \in \mathbb{R}^2$.

The goal of our studies is at least threefold. First, we seek an
estimation procedure $\hat F(x)$, $x \in [-1/2, 1/2]^2$, for $F$ which could
be applicable to any function $F$ satisfying (1.2). Moreover, we would like
to bound the risk of this estimator uniformly over the set
$\mathbb{F}(\beta_0, M) \times \mathbb{S}^1$. More precisely, we want to
establish for $\hat F(x)$ the so-called local oracle inequality: for any
$x \in [-1/2, 1/2]^2$

(1.7) $\mathcal{R}_{r,x}^{(n)}(\hat F, F) \le C_r \mathcal{A}^{(n)}_{f,\theta^*}(x), \quad \forall f \in \mathbb{F}(\beta_0, M), \ \forall \theta^* \in \mathbb{S}^1.$

Here the quantity $\mathcal{A}^{(n)}_{f,\theta^*}(\cdot)$ is completely
determined by the function $f$, the vector $\theta^*$ and the number of
observations $n$, while $C_r$ is a numerical constant independent of $F$
and $n$.

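Once established, the local oracle inequality allows us to derive minimax
adaptive results for function estimation at a given point. Indeed, let
$\mathbb{F}(\gamma)$, $\gamma \in \Gamma$, be a collection of functional
classes such that
$\bigcup_{\gamma \in \Gamma} \mathbb{F}(\gamma) \subset \mathbb{F}(\beta_0, M)$.
For any $\gamma \in \Gamma$ define

$$\phi_n(\gamma) = \inf_{\tilde F} \sup_{(f, \theta^*) \in \mathbb{F}(\gamma) \times \mathbb{S}^1} \mathcal{R}_{r,x}^{(n)}(\tilde F, F),$$

where the infimum is taken over all possible estimators. The quantity
$\phi_n(\gamma)$ is the minimax risk on
$\mathbb{F}(\gamma) \times \mathbb{S}^1$, and the problem arising in the
framework of minimax adaptive estimation consists in the following. One
has to construct an estimator $F^*$ such that for any $\gamma \in \Gamma$

(1.8) $\sup_{(f, \theta^*) \in \mathbb{F}(\gamma) \times \mathbb{S}^1} \mathcal{R}_{r,x}^{(n)}(F^*, F) \asymp \phi_n(\gamma), \quad n \to \infty.$

The estimator $F^*$ satisfying (1.8) is called optimally rate adaptive
over the collection $\{\mathbb{F}(\gamma), \gamma \in \Gamma\}$. Let (1.7)
be proved and suppose that for any $\gamma \in \Gamma$

$$\sup_{(f, \theta^*) \in \mathbb{F}(\gamma) \times \mathbb{S}^1} \mathcal{A}^{(n)}_{f,\theta^*}(x) \asymp \phi_n(\gamma), \quad n \to \infty.$$

Then one can assert that the estimator $\hat F$ is adaptive over
$\{\mathbb{F}(\gamma), \gamma \in \Gamma\}$.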

Thus, the first task is to prove (1.7). To the best of our knowledge,
results of this kind do not exist in the context of regression with random
design, not only under the single-index constraint but also in univariate
regression.

Next, we apply this result to minimax adaptive estimation over the
collection $\mathbb{F}(\gamma) = \mathbb{H}(\beta, L)$, $\gamma = (\beta, L)$,
where $\mathbb{H}(\beta, L)$ is a Hölder class of functions, see
Definition 1. In particular, we find the minimax rate over
$\mathbb{H}(\beta, L) \times \mathbb{S}^1$ and prove that our estimator
$\hat F$ achieves that rate, i.e. is optimally adaptive. This result is
quite surprising because, if $\theta^*$ is fixed, say $\theta^* = (1, 0)^T$,
then it is well known that an optimally adaptive estimator does not exist,
see Lepski (1990) (Gaussian white noise model), Brown and Low (1996)
(density model) and Gaïffas (2007) (regression model).
Note also that local oracle inequality (1.7) allows us to bound from above
the "global" risk as well. Indeed, in view of Jensen's inequality and
Fubini's theorem

$$\big[\mathcal{R}_r^{(n)}(\hat F, F)\big]^r \le \mathbb{E}_F^{(n)} \|\hat F(\cdot) - F(\cdot)\|_r^r = \big\|\mathcal{R}_{r,\cdot}^{(n)}(\hat F, F)\big\|_r^r$$

and, therefore,

(1.9) $\mathcal{R}_r^{(n)}(\hat F, F) \le C_r \big\|\mathcal{A}^{(n)}_{f,\theta^*}\big\|_r.$

The latter inequality is called the global oracle inequality, and in the
framework of the present study it supplies new results. Just as local
oracle inequality (1.7) is a powerful tool for deriving minimax adaptive
results in pointwise estimation, global oracle inequality (1.9) can be
used for constructing adaptive estimators of the entire function $F$.

We will consider the collection of Nikol'skii classes
$\mathbb{N}_p(\beta, L)$, see Definition 2, where $\beta, L > 0$ and
$1 \le p < \infty$. It is important to emphasize that by considering these
classes we want to estimate functions with inhomogeneous smoothness. This
means that the underlying function can be very regular on some parts of
the observation domain and rather irregular on other sets.

We will compute the asymptotic bounds on

$$\sup_{(f, \theta^*) \in \mathbb{N}_p(\beta, L) \times \mathbb{S}^1} \big\|\mathcal{A}^{(n)}_{f,\theta^*}\big\|_r$$

and show that, if $(2\beta + 1)p < r$, the rate of convergence coincides
with the minimax rate over $\mathbb{N}_p(\beta, L) \times \mathbb{S}^1$. This
means that our estimator $\hat F$ is optimally rate adaptive over the
collection $\{\mathbb{N}_p(\beta, L) \times \mathbb{S}^1, \beta > 0, L > 0\}$
whenever $(2\beta + 1)p < r$. In the case $(2\beta + 1)p > r$ we will prove
that the latter bound differs from the bound on the minimax risk by a
logarithmic factor. Following the contemporary language we say that the
estimator $\hat F$ is "nearly" adaptive. However, the construction of an
estimator that is optimally rate adaptive over the entire range of
Nikol'skii classes under single-index constraint (1.2) remains an open
problem.

We would like to emphasize that all these results are completely new.
Adaptive estimation under the $L_r$ loss and the single-index constraint
has not been studied, except in the case $r = 2$, see Gaïffas and Lecué
(2007). Note, however, that the cited results were obtained under the
Gaussian errors model and over a collection of Hölder classes, which does
not admit the consideration of inhomogeneous functions.
Remarks. It turns out that the adaptation to the unknown $\theta^*$ and
$f(\cdot)$ can be formulated in terms of selection from a special family
of kernel estimators in the spirit of the Lepski and the
Goldenshluger-Lepski selection rules, see Lepski (1990), Kerkyacharian
et al. (2001), Goldenshluger and Lepski (2008). However, the procedure
proposed here is quite different from the aforementioned ones, and it
allows us to solve the problem of minimax adaptive estimation under the
$L_r$ losses over a collection of Nikol'skii classes.

It is worth mentioning that the considered single-index model is not only
of high theoretical interest but is also actively used, especially in
econometrics, e.g. Horowitz (1998), Maddala (1983). The estimation,
nevertheless, is usually performed under smoothness assumptions on the
link function. One usually uses the $L_2$ losses, and the available
methodology is based on these restrictions. To the best of our knowledge
the only exceptions are Golubev (1992) for minimax estimation under the
projection pursuit constraints, and Goldenshluger and Lepski (2009), who
present a novel procedure that adapts simultaneously to unknown
smoothness and structure.

Organization of the paper. In Section 2 we motivate and explain the
proposed selection rule and establish for it local and global oracle
inequalities, Section 2.1. Section 2.2 is devoted to the application of
these results to minimax adaptive estimation. The proofs of the main
results are given in Section 3, and the proofs of technical lemmas are
postponed to the Appendix.

2. Main results. In this section we present our procedure and establish
for it local and global oracle inequalities. Then, we apply these results
to adaptive estimation over a collection of Hölder classes (pointwise
estimation) and over a collection of Nikol'skii classes (estimating the
entire function with the accuracy of an estimator measured under the
$L_r$ risk).

2.1. Oracle approach. Let $\mathcal{K} : \mathbb{R} \to \mathbb{R}$ be a
function (kernel) satisfying the condition $\int \mathcal{K} = 1$. With any
such $\mathcal{K}$, any $z \in \mathbb{R}$, $h \in (0, 1]$ and any
$f \in \mathbb{F}(\beta_0, M)$ we associate the quantity

$$\Delta_{\mathcal{K},f}(h, z) = \sup_{\delta \le h} \Big| \delta^{-1} \int \mathcal{K}\big([u - z]/\delta\big) \big(f(u) - f(z)\big)\,du \Big|.$$

We note that $\delta^{-1} \int \mathcal{K}([u - z]/\delta) f(u)\,du$ (kernel
smoother) can be understood as an approximation of the function $f$ at the
point $z$. Thus, $\Delta_{\mathcal{K},f}(h, z)$ is a monotone approximation
error provided by this kernel smoother. In particular,
$\Delta_{\mathcal{K},f}(h, z) \to 0$ as $h \to 0$ in view of Assumption 3.

Throughout the paper $\|\mathcal{K}\|_p$, $1 \le p \le \infty$, denotes the
$L_p$ norm of $\mathcal{K}$, and we will assume that the kernel
$\mathcal{K}$ satisfies the following condition.
Assumption 4. (1) $\mathrm{supp}(\mathcal{K}) \subseteq [-1/2, 1/2]$,
$\int \mathcal{K} = 1$, $\mathcal{K}$ is symmetric;

(2) there exists $Q > 0$ such that
$|\mathcal{K}(u) - \mathcal{K}(v)| \le Q|u - v|, \ \forall u, v \in \mathbb{R}$.

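2.1.1. Oracle estimator. For any $y \in \mathbb{R}$ define

$$\Delta^*_{\mathcal{K},f}(h, y) = \sup_{a > 0} (2a)^{-1} \int_{y-a}^{y+a} \Delta_{\mathcal{K},f}(h, z)\,dz.$$

Thus, $\Delta^*_{\mathcal{K},f}(h, \cdot)$ is the Hardy-Littlewood maximal
function of $\Delta_{\mathcal{K},f}(h, \cdot)$, see for example Wheeden and
Zygmund (1977). Note that, since $f \in \mathbb{F}(\beta_0, M)$, we have
$\Delta^*_{\mathcal{K},f}(h, \cdot) \ge \Delta_{\mathcal{K},f}(h, \cdot)$.

Now we are in a position to introduce the oracle estimator. Set for any
$y \in \mathbb{R}$

(2.1) $h^*_{\mathcal{K},f}(y) = \sup\big\{ h \in [h_{\min}, 1] : \sqrt{nh}\, \Delta^*_{\mathcal{K},f}(h, y) \le \|\mathcal{K}\|_\infty \sqrt{\ln(n)} \big\},$

where $h_{\min}$ is defined in (1.5).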

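Some remarks are in order. First, we note that
$\Delta^*_{\mathcal{K},f}(h, \cdot) \le M \|\mathcal{K}\|_1 h^{\beta_0}$ for
any $f \in \mathbb{F}(\beta_0, M)$ and any $h > 0$. Hence,
$\sqrt{n h_{\min}}\, \Delta^*_{\mathcal{K},f}(h_{\min}, \cdot) \le \|\mathcal{K}\|_1 \sqrt{\ln(n)}$
for any $n \ge n_0$ in view of (1.6). Next, Assumption 4 (2) implies
obviously that $\Delta^*_{\mathcal{K},f}(\cdot, y)$ is a continuous function
and, hence,

(2.2) either $\sqrt{n h^*_{\mathcal{K},f}(y)}\, \Delta^*_{\mathcal{K},f}\big(h^*_{\mathcal{K},f}(y), y\big) = \|\mathcal{K}\|_\infty \sqrt{\ln(n)},$

(2.3) or $\sqrt{nh}\, \Delta^*_{\mathcal{K},f}(h, y) < \|\mathcal{K}\|_\infty \sqrt{\ln(n)}, \quad \forall h \in [h_{\min}, 1].$

Here we have also used that $\|\mathcal{K}\|_1 \le \|\mathcal{K}\|_\infty$ in
view of Assumption 4 (1).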

A quantity similar to $h^*_{\mathcal{K},f}(\cdot)$ defined above first
appeared in Lepski et al. (1997) in the context of estimating univariate
functions possessing inhomogeneous smoothness. Some years later this
approach was developed in Kerkyacharian et al. (2001) and Goldenshluger
and Lepski (2008) for multivariate function estimation. In these papers,
the interested reader can find a more detailed discussion of the oracle
approach. In the present paper we try to adapt the "ideology" proposed in
the aforementioned papers to estimation under the single-index
constraint. Our main idea is based on the following rather simple
observation:

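For any $(\theta, h) \in \mathbb{S}^1 \times [h_{\min}, 1]$ define the matrix

$$E_{(\theta,h)} = \begin{pmatrix} h^{-1}\theta_1 & h^{-1}\theta_2 \\ -\theta_2 & \theta_1 \end{pmatrix}$$

and consider the family of kernel estimators

$$\mathcal{F} = \Big\{ \hat F_{(\theta,h)}(\cdot) = \det\big(E_{(\theta,h)}\big) \frac{1}{n} \sum_{i=1}^n \frac{K\big(E_{(\theta,h)}(X_i - \cdot)\big)}{g(X_i)}\, Y_i, \ (\theta, h) \in \mathbb{S}^1 \times [h_{\min}, 1] \Big\}.$$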



Here $K(u, v) = \mathcal{K}(u)\mathcal{K}(v)$. We remark, first, that in view
of Assumption 2 the estimator $\hat F_{(\theta,h)}$ is well defined since

$$K\big(E_{(\theta,h)}(t - x)\big) = 0, \quad \forall t \notin [-3/2, 3/2]^2, \ \forall x \in [-1/2, 1/2]^2.$$

This property follows from Assumption 4 (1). Moreover,
$\det\big(E_{(\theta,h)}\big) = h^{-1}$. The choice $\theta = \theta^*$ and
$h = h^* := h^*_{\mathcal{K},f}(x^T \theta^*)$ leads to the so-called oracle
estimator $\hat F_{(\theta^*,h^*)}(\cdot)$. We note that
$\hat F_{(\theta^*,h^*)}(\cdot)$ is not an estimator in the usual sense,
since it depends on the function $F$ to be estimated (more precisely on
$(f, \theta^*)$, which determines $F$). The meaning of this estimator is
explained by the following result.

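Proposition 1. For any $(f, \theta^*) \in \mathbb{F}(\beta_0, M) \times \mathbb{S}^1$,
$r \ge 1$ and any $n \ge n_0$

$$\mathcal{R}_{r,x}^{(n)}\big(\hat F_{(\theta^*,h^*)}, F\big) \le c_r \sqrt{\frac{\ln(n)}{n h^*_{\mathcal{K},f}(x^T \theta^*)}}, \quad \forall x \in [-1/2, 1/2]^2,$$

where $c_r > 0$ is a numerical constant independent of $n$.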

The proof of the proposition, based on the Rosenthal inequality, is
straightforward and can be omitted.

The result of Proposition 1 can be treated as follows. The "oracle" knows
the exact value of the index vector $\theta^*$ and the optimal, up to
$\ln(n)$, trade-off $h^*$ between the approximation error determined by
$\Delta^*_{\mathcal{K},f}(h^*, \cdot)$ and the stochastic error provided by
the kernel estimator from the collection $\mathcal{F}$ with bandwidth
$h^*$. It explains why the "oracle" chooses the "estimator"
$\hat F_{(\theta^*,h^*)}$.
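
The kernel estimators indexed by a direction and a bandwidth can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it assumes that the matrix $E_{(\theta,h)}$ has rows $(\theta_1/h, \theta_2/h)$ and $(-\theta_2, \theta_1)$, it uses a triangular kernel as one example of a symmetric Lipschitz kernel supported on $[-1/2, 1/2]$, and all function names are hypothetical.

```python
import math
import random


def kernel(u):
    """Triangular kernel on [-1/2, 1/2]: symmetric, Lipschitz, integral 1."""
    return max(0.0, 2.0 - 4.0 * abs(u))


def oracle_matrix(theta, h):
    """Assumed form of E_(theta,h): bandwidth h along theta, 1 across it."""
    return ((theta[0] / h, theta[1] / h), (-theta[1], theta[0]))


def det2(m):
    """Determinant of a 2 x 2 matrix given as a pair of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def kernel_estimate(data, x, theta, h, density):
    """det(E) * n^{-1} * sum_i K(E (X_i - x)) * Y_i / g(X_i)."""
    e = oracle_matrix(theta, h)
    total = 0.0
    for (x1, x2), y in data:
        d1, d2 = x1 - x[0], x2 - x[1]
        u = e[0][0] * d1 + e[0][1] * d2   # coordinate along theta, scaled by 1/h
        v = e[1][0] * d1 + e[1][1] * d2   # coordinate orthogonal to theta
        total += kernel(u) * kernel(v) * y / density((x1, x2))
    return det2(e) * total / len(data)
```

The product kernel smooths with bandwidth $h$ along $\theta$ and with bandwidth 1 in the orthogonal direction, which is where the single-index structure enters: when $\theta = \theta^*$, the function $F$ is constant along the orthogonal direction, so no bias is incurred there.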

In the next paragraph we propose a "real" (based on the observations)
estimator $\hat F(\cdot)$, which mimics the oracle estimator. This means
that for any $(f, \theta^*) \in \mathbb{F}(\beta_0, M) \times \mathbb{S}^1$,
$r \ge 1$ and $n \ge n_0$

$$\mathcal{R}_{r,x}^{(n)}(\hat F, F) \le c'_r \sqrt{\frac{\ln(n)}{n h^*_{\mathcal{K},f}(x^T \theta^*)}}, \quad \forall x \in [-1/2, 1/2]^2,$$

where $c'_r$ is an absolute constant independent of the number of
observations $n$ and the underlying function $F$. The latter result is a
local oracle inequality. The construction of the estimator $\hat F(\cdot)$
is based on data-driven selection from the family $\mathcal{F}$.
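2.1.2. Selection rule. For any $\theta, \nu \in \mathbb{S}^1$ and any
$h \in [h_{\min}, 1]$ define

$$\bar E_{(\theta,h)(\nu,h)} = \begin{pmatrix} \dfrac{\theta_1 + \nu_1}{2h(1 + |\nu^T \theta|)} & \dfrac{\theta_2 + \nu_2}{2h(1 + |\nu^T \theta|)} \\[2ex] -\dfrac{\theta_2 + \nu_2}{2(1 + |\nu^T \theta|)} & \dfrac{\theta_1 + \nu_1}{2(1 + |\nu^T \theta|)} \end{pmatrix}$$

and set

$$E_{(\theta,h)(\nu,h)} = \begin{cases} \bar E_{(\theta,h)(\nu,h)}, & \nu^T \theta \ge 0; \\ \bar E_{(-\theta,h)(\nu,h)}, & \nu^T \theta < 0. \end{cases}$$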



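It is easy to check that
$(4h)^{-1} \le \det\big(E_{(\theta,h)(\nu,h)}\big) \le (2h)^{-1}$. A kernel
estimator associated with the matrix $E_{(\theta,h)(\nu,h)}$ is defined by

(2.4) $\hat F_{(\theta,h)(\nu,h)}(x) = \det\big(E_{(\theta,h)(\nu,h)}\big) \dfrac{1}{n} \sum_{i=1}^n \dfrac{K\big(E_{(\theta,h)(\nu,h)}(X_i - x)\big)}{g(X_i)}\, Y_i.$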
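
The determinant bounds can be checked numerically. The sketch below assumes that the matrix $E_{(\theta,h)(\nu,h)}$ has first row $(\theta_1 + \nu_1, \theta_2 + \nu_2)/(2h(1 + |\nu^T\theta|))$ and second row $(-(\theta_2 + \nu_2), \theta_1 + \nu_1)/(2(1 + |\nu^T\theta|))$, with $\theta$ replaced by $-\theta$ when $\nu^T\theta < 0$; this form and the function names are assumptions of the illustration. Under it, the determinant equals $1/(2h(1 + |\nu^T\theta|))$, which lies between $(4h)^{-1}$ and $(2h)^{-1}$.

```python
import math


def pair_matrix(theta, nu, h):
    """Assumed E_(theta,h)(nu,h); theta is flipped when nu^T theta < 0."""
    if theta[0] * nu[0] + theta[1] * nu[1] < 0:
        theta = (-theta[0], -theta[1])
    c = 1.0 + abs(theta[0] * nu[0] + theta[1] * nu[1])
    s1, s2 = theta[0] + nu[0], theta[1] + nu[1]
    return ((s1 / (2 * h * c), s2 / (2 * h * c)),
            (-s2 / (2 * c), s1 / (2 * c)))


def det2(m):
    """Determinant of a 2 x 2 matrix given as a pair of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def check_determinant_bounds(h, steps=60):
    """True if (4h)^-1 <= det <= (2h)^-1 for all pairs on a grid of angles."""
    tol = 1e-9
    for i in range(steps):
        for j in range(steps):
            a = 2 * math.pi * i / steps
            b = 2 * math.pi * j / steps
            theta = (math.cos(a), math.sin(a))
            nu = (math.cos(b), math.sin(b))
            d = det2(pair_matrix(theta, nu, h))
            if not (1 / (4 * h) - tol <= d <= 1 / (2 * h) + tol):
                return False
    return True
```

The lower bound is attained when $\nu = \pm\theta$ and the upper bound when $\nu$ and $\theta$ are orthogonal.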



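Once again we note that the estimator $\hat F_{(\theta,h)(\nu,h)}$ is
well-defined since

$$K\big(E_{(\theta,h)(\nu,h)}(t - x)\big) = 0, \quad \forall t \notin [-5/2, 5/2]^2, \ \forall x \in [-1/2, 1/2]^2.$$

For any $u_1, u_2 \in \mathbb{R}$ set
$K_{\mathfrak{h}}(u_1, u_2) = \mathfrak{h}^{-2} \mathcal{K}(u_1/\mathfrak{h}) \mathcal{K}(u_2/\mathfrak{h})$
and introduce

$$\tilde F(t) = n^{-1} \sum_{i=1}^n g^{-1}(X_i) K_{\mathfrak{h}}(X_i - t) Y_i, \qquad \hat F_\infty = 2\|\tilde F\|_\infty + 2C_5(n),$$

where $\|\tilde F\|_\infty = \sup_{t \in [-1/2, 1/2]^2} |\tilde F(t)|$ and
$\mathfrak{h}$ is defined in (1.5). Put also

$$\mathrm{TH}(\eta) = 2\big[\|\mathcal{K}\|_\infty^2 \sqrt{\ln(n)} + \hat F_\infty C_1(n) + C_2(n)\big] (\eta n)^{-1/2}, \quad \eta \in (0, 1].$$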

The quantities $C_1(n)$, $C_2(n)$ and $C_5(n)$ are listed at the beginning
of Section 3.1. Their explicit expressions are too cumbersome to present
here.

Set $\mathcal{H}_n = \{h_k = 2^{-k}, k = 0, 1, \ldots\} \cap [2^{-1} h_{\min}, 1]$
and let for any $\theta \in \mathbb{S}^1$ and $h \in \mathcal{H}_n$



R<£\h) 



sup 



SUP \F(fi,r,Xu,v)( x ) ~ F M)( X )| - TH ( 7 ?) 



sup sup 



F (eth) (x)-F l9iri) (x)\-TH(ri) 



Define (9, h) as a solution of the following minimization problem: 
R^(9,h) + R^(h) + TB(h) 



(2.5) 



inf 
(0,/i)e sl x?^ 



R^(9,h) + R^(h)+TB(h) 



Our final estimator is $\hat F(x) = \hat F_{(\hat\theta, \hat h)}(x)$.
Remark 1. We note that all random fields involved in the description of
selection rule (2.5) are continuous functions on $\mathbb{S}^1$ thanks to
Assumption 4 (2) and, moreover, $\mathcal{H}_n$ is finite. Thus,
$(\hat\theta, \hat h)$ is $\{(X_i, Y_i)\}_{i=1}^n$-measurable and
$(\hat\theta, \hat h) \in \mathbb{S}^1 \times \mathcal{H}_n$, see Jennrich
(1969).
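
The finiteness of the bandwidth grid used in Remark 1 is easy to see in code. The following minimal sketch builds the dyadic grid of bandwidths $2^{-k}$, $k = 0, 1, \ldots$, lying in $[h_{\min}/2, 1]$; the value of $h_{\min}$ is passed in as a parameter rather than computed from (1.5), and the function name is hypothetical.

```python
def dyadic_grid(h_min):
    """Bandwidths 2^{-k}, k = 0, 1, ..., that lie in [h_min / 2, 1]."""
    if not 0.0 < h_min <= 1.0:
        raise ValueError("h_min must lie in (0, 1]")
    grid = []
    h = 1.0
    while h >= h_min / 2.0:
        grid.append(h)
        h /= 2.0   # powers of two are exact in binary floating point
    return grid
```

The grid has on the order of $\log_2(1/h_{\min})$ elements, so the selection over bandwidths is a search over a small finite set, while the selection over directions ranges over the whole circle.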

Remark 2. Our selection rule (2.5) is defined in the case $d = 2$. The
main difficulty in extending it to $d > 2$ consists in the construction of
the matrix $E_{(\theta,h)(\nu,h)}$ for any vectors
$\theta, \nu \in \mathbb{S}^{d-1}$. Indeed, analyzing the proof of
Theorem 1 we remark that the following properties should be fulfilled:

$$E_{(\theta,h)(\nu,h)} \in \mathfrak{E}_{a,A}, \qquad E_{(\theta,h)(\nu,h)} = \pm E_{(\nu,h)(\theta,h)}, \quad \forall \theta, \nu \in \mathbb{S}^{d-1}, \ \forall h \in \mathcal{H}_n,$$

where the class of matrices $\mathfrak{E}_{a,A}$ is defined in (3.2). If
$d = 2$, these requirements hold. However, we were not able to construct
a class of matrices obeying the latter restrictions in dimension strictly
larger than 2. Note nevertheless that if such a class were found, our
results could be extended to $d > 2$ without any additional consideration.

2.1.3. Local and global oracle inequalities. To formulate our main results
we need to strengthen restriction (1.6) imposed on the minimal sample
size $n$. Let $n_1 \ge 1$ be defined as follows:

(2.6) $n_1 = \inf\big\{ m \in \mathbb{N}^* : (n \mathfrak{h}^2)^{-1/2} C_3(n) \le 1/2, \ \forall n \ge m \big\},$

where $\mathfrak{h}$ is defined in (1.5), and $C_3(n)$ is given at the
beginning of Section 3.1. All our results below will be proved under the
condition $n \ge n_0 \vee n_1$.

First, we note that $n_1$ is well-defined since
$(n \mathfrak{h}^2)^{-1/2} C_3(n) \to 0$ as $n \to \infty$. Next, contrary to
restriction (1.6), which relates the sample size $n$ to the quantities
$\beta_0, M$ appearing in Assumption 3, restriction (2.6) links the minimal
value of $n$ with the quantity $\underline{g}$ appearing in Assumption 2.

Theorem 1. For any $(f, \theta^*) \in \mathbb{F}(\beta_0, M) \times \mathbb{S}^1$,
$x \in [-1/2, 1/2]^2$, $r \ge 1$ and $n \ge n_0 \vee n_1$



The constants $c_{r,1}$ and $c_{r,2}$ are independent of $n$ and $F$, and
their explicit expressions can be extracted from the proof of the theorem.

As already mentioned, the global oracle inequality is obtained by
integrating the local oracle inequality. Indeed, using Jensen's inequality
and Fubini's theorem we get
$\mathcal{R}_r^{(n)}(\hat F, F) \le \big\|\mathcal{R}_{r,\cdot}^{(n)}(\hat F, F)\big\|_r$, so



Tli n \F,F) < c rjl 



I Iia 

2'2J 



ln(n) 



ra/i* Ax T 0* 



Integration by substitution yields 

ln(n) 



1 



1 ^1 

2 ' 2 J 



dx < 



dx> + c r 2^ 



ln(n) 



nhfc Az) 



dz, 



L]2 \nh* K Ax T 9*) 
that leads to the following bound. 

Theorem 2. For any $(f, \theta^*) \in \mathbb{F}(\beta_0, M) \times \mathbb{S}^1$, $r \ge 1$ and $n \ge n_0 \vee n_1$

i 
ln(n) 



K^lF^FJKCr,! 



nh K j 



+ c r on 



it), r 



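2.1.4. Extension to the case $\theta^* \notin \mathbb{S}^1$. Define
$f_{\theta^*}(t) = f(|\theta^*|_2 t)$, $\vartheta^* = \theta^*/|\theta^*|_2$
and let $F_{\theta^*}(x) := f_{\theta^*}(x^T \vartheta^*)$. Obviously,

$$f_{\theta^*}(x^T \vartheta^*) = f(x^T \theta^*), \ \forall x \quad \Longrightarrow \quad F_{\theta^*}(\cdot) = F(\cdot),$$

so the estimation of $F(\cdot)$ is equivalent to the estimation of
$F_{\theta^*}(\cdot)$. Moreover, $\vartheta^* \in \mathbb{S}^1$ and,
therefore, the results obtained in Theorems 1 and 2 are applicable. To do
that it suffices to replace $f$ by $f_{\theta^*}$ in the definition of
$h^*_{\mathcal{K},f}(\cdot)$. We note, however, that there is no general
recipe for expressing $h^*_{\mathcal{K},f_{\theta^*}}(\cdot)$ via
$h^*_{\mathcal{K},f}(\cdot)$, although in particular cases (mainly in
adaptive estimation over a collection of classes of smooth functions) it
is often possible.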
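
The reduction to a unit index vector amounts to moving the norm of $\theta^*$ into the link function. A minimal Python sketch of this reparametrization follows; the function names and the example link are illustrative assumptions.

```python
import math


def normalize_index(link, theta):
    """Return (rescaled link, unit vector) describing the same function F.

    The rescaled link t -> link(|theta| * t), evaluated at x^T (theta/|theta|),
    equals link(x^T theta).
    """
    norm = math.hypot(theta[0], theta[1])
    if norm == 0.0:
        raise ValueError("theta must be non-zero")
    unit = (theta[0] / norm, theta[1] / norm)

    def rescaled_link(t):
        return link(norm * t)

    return rescaled_link, unit


def evaluate(link, theta, x):
    """F(x) = link(x^T theta)."""
    return link(x[0] * theta[0] + x[1] * theta[1])
```

Note that the rescaling changes the smoothness constants of the link: if $f$ is Hölder with constant $L$ and exponent $\beta \le 1$, the rescaled link is Hölder with constant $L|\theta^*|_2^{\beta}$.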

2.2. Adaptive estimation. In this section we first apply the local oracle
inequality given in Theorem 1 to the problem of pointwise adaptive
estimation over a collection of Hölder classes. Next, we study the problem
of adaptive estimation under the $L_r$ losses over a collection of
Nikol'skii classes. The corresponding results are deduced from the global
oracle inequality proved in Theorem 2.

Throughout this section we will assume that the kernel $\mathcal{K}$
additionally obeys Assumption 5 below. Introduce the following notation:
for any $a > 0$ let $m_a \in \mathbb{N}$ be the maximal integer strictly
less than $a$.

Assumption 5. There exists $b > 0$ such that

$$\int z^j \mathcal{K}(z)\,dz = 0, \quad \forall j = 1, \ldots, m_b.$$
2.2.1. Pointwise adaptive estimation. We start this section with the
definition of the Hölder class of functions.

Definition 1. Let $\beta > 0$ and $L > 0$. A function
$\ell : \mathbb{R} \to \mathbb{R}$ belongs to the Hölder class
$\mathbb{H}(\beta, L)$ if $\ell$ is $m_\beta$-times continuously
differentiable, $\|\ell^{(m)}\|_\infty \le L$, $\forall m \le m_\beta$, and

$$\big|\ell^{(m_\beta)}(t + h) - \ell^{(m_\beta)}(t)\big| \le L h^{\beta - m_\beta}, \quad \forall t \in \mathbb{R}, \ h > 0.$$

The aim is to estimate the function $F(x)$ at a given point
$x \in [-1/2, 1/2]^2$ under the assumption that
$F \in \mathbb{F}(b) := \bigcup_{\beta \le b} \bigcup_{L > 0} \mathbb{F}_2(\beta, L)$,
where

$$\mathbb{F}_d(\beta, L) = \big\{ F : \mathbb{R}^d \to \mathbb{R} \mid F(z) = f(z^T \theta), \ f \in \mathbb{H}(\beta, L), \ \theta \in \mathbb{S}^{d-1} \big\},$$

the constant $b$ is from Assumption 5, and $d \ge 2$ is the dimension. We
will see that $b$ can be an arbitrary number but it must be chosen a
priori.

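Theorem 3. Let $b > 0$ be fixed and let Assumptions 4 and 5 hold. Then,
for any $\beta \le b$, $L > 0$ and $x \in [-1/2, 1/2]^2$,

$$\sup_{F \in \mathbb{F}_2(\beta, L)} \mathcal{R}_{r,x}^{(n)}\big(\hat F_{(\hat\theta, \hat h)}, F\big) \le \kappa_1 \psi_n(\beta, L),$$

where $\psi_n(\beta, L) = L^{\frac{1}{2\beta+1}} \big(n^{-1} \ln(n)\big)^{\frac{\beta}{2\beta+1}}$
and $\kappa_1$ is independent of $n$.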

The proof of the theorem is based on the evaluation of the uniform, over
$\mathbb{H}(\beta, L)$, lower bound for $h^*_{\mathcal{K},f}(\cdot)$ and on
the application of Theorem 1. We note that a similar upper bound for the
minimax risk was established in Goldenshluger and Lepski (2008) in the
framework of the Gaussian white noise model, but the estimation procedure
used there is different from our selection rule.
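
The form of the rate $\psi_n(\beta, L)$ can be motivated by a short heuristic computation, which is a sketch and not the paper's proof. Assume that, for $f \in \mathbb{H}(\beta, L)$ and a kernel satisfying Assumption 5, the maximal approximation error is bounded by $c L h^{\beta}$ for some constant $c$ (a hypothetical constant introduced only for this illustration). Balancing this bound against the threshold in (2.1) gives:

```latex
% Heuristic sketch; c is a hypothetical constant with
% \Delta^*_{\mathcal{K},f}(h, \cdot) \le c L h^{\beta}.
\begin{align*}
\sqrt{nh}\, c L h^{\beta} = \|\mathcal{K}\|_\infty \sqrt{\ln n}
  &\iff h^{2\beta+1}
   = \frac{\|\mathcal{K}\|_\infty^2}{c^2 L^2} \cdot \frac{\ln n}{n}, \\
h \asymp \Big( \frac{\ln n}{L^2 n} \Big)^{\frac{1}{2\beta+1}}
  &\implies \sqrt{\frac{\ln n}{n h}}
   \asymp L^{\frac{1}{2\beta+1}}
          \Big( \frac{\ln n}{n} \Big)^{\frac{\beta}{2\beta+1}}
   = \psi_n(\beta, L).
\end{align*}
```

The exponent follows from $\tfrac12 - \tfrac{1}{2(2\beta+1)} = \tfrac{\beta}{2\beta+1}$, so a lower bound on the oracle bandwidth of this order translates directly into the stated rate.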

The main question, however, is whether $\psi_n(\beta, L)$ coincides with
the minimax rate of convergence for any given value of $\beta$ and $L$. To
answer this question we will need some additional assumptions on the
densities of the noise variable $\varepsilon_1$ and the design variable
$X_1$.

Assumption 6. There exist $\mathfrak{q}, \mathfrak{Q} > 0$ such that for
any $v_1, v_2 \in [-\mathfrak{q}, \mathfrak{q}]$

$$\int_{\mathbb{R}} \frac{p(y + v_1) p(y + v_2)}{p(y)}\,dy \le 1 + \mathfrak{Q} |v_1 v_2|.$$
It is easy to see that the density of the normal law
$\mathcal{N}(0, \sigma^2)$, $\sigma^2 > 0$, obeys the aforementioned
assumption. In general, this assumption is fulfilled if the density $p$
is regular and decreases rapidly at infinity. More precisely, if the
Fisher information corresponding to the density $p$ is finite and the
function $\int [p'(y + \cdot)]^2 p^{-1}(y)\,dy$ is continuous at zero, then
Assumption 6 is verified.

Assumption 7. There exist 3 > and w > 1 such that 
g(x) < - — t— , Vx G R d , 

where $|\cdot|_2$ is the Euclidean vector norm on $\mathbb{R}^d$.

We remark that the imposed assumption is very weak and holds for the
majority of probability distributions emerging in statistics.

Theorem 4. Let Assumptions 6 and 7 be fulfilled. Then for any
$x \in [-1/2, 1/2]^d$, $d \ge 2$, $r \ge 1$, $\beta, L > 0$, and any
$n \in \mathbb{N}^*$ large enough,

$$\inf_{\tilde F} \sup_{F \in \mathbb{F}_d(\beta, L)} \mathcal{R}_{r,x}^{(n)}(\tilde F, F) \ge \kappa_2 \psi_n(\beta, L),$$

where the infimum is over all possible estimators. Here $\kappa_2$ is a
numerical constant independent of $n$ and $L$.

To the best of our knowledge this lower bound is new. We would like to
emphasize that Assumption 6, under which this theorem is proved, is close
to being necessary. It is not difficult to provide examples in which this
condition does not hold and the assertion of Theorem 4 is no longer true.

We conclude from Theorems 3 and 4 that the estimator
$\hat F_{(\hat\theta, \hat h)}$ is minimax adaptive with respect to the
collection of classes $\{\mathbb{F}_d(\beta, L), \beta \le b, L > 0\}$. As
already mentioned, this result is quite surprising. Indeed, if for example
the directional vector $\theta = (1, 0)^T$, i.e. is known, then
$\mathbb{F}(\beta, L) = \mathbb{H}(\beta, L)$, and the considered estimation
problem can be easily reduced to estimation of $f$ at a given point in
the univariate regression model. As shown in Gaïffas (2007), an adaptive
estimator over the collection $\{\mathbb{H}(\beta, L), \beta \le b, L > 0\}$
does not exist, and a price has to be paid for adaptation. The latter
means that the asymptotic bound on the minimax risk provided by an
adaptive estimator differs from the minimax rate of convergence by some
factor. This factor for the majority of known results is $\ln(n)$.

Also, we would like to emphasize that the assertion of Theorem 4 is proved
for arbitrary dimension.
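2.2.2. Adaptive estimation under the $L_r$ losses. We start this section
with the definition of the Nikol'skii class of functions.

Definition 2. Let $\beta > 0$, $L > 0$ and $p \in [1, \infty)$ be fixed. A
function $\ell : \mathbb{R} \to \mathbb{R}$ belongs to the Nikol'skii class
$\mathbb{N}_p(\beta, L)$ if $\ell$ is $m_\beta$-times continuously
differentiable and

$$\Big( \int \big|\ell^{(m)}(t)\big|^p\,dt \Big)^{1/p} \le L, \quad \forall m = 0, \ldots, m_\beta;$$

$$\Big( \int \big|\ell^{(m_\beta)}(t + h) - \ell^{(m_\beta)}(t)\big|^p\,dt \Big)^{1/p} \le L h^{\beta - m_\beta}, \quad \forall h > 0.$$

Later on we assume that $\mathbb{N}_p(\beta, L) = \mathbb{H}(\beta, L)$ if
$p = \infty$.

Here the target of estimation is the entire function $F(\cdot)$ under the
assumption that
$F \in \mathbb{F}_p(b) := \bigcup_{\beta \le b} \bigcup_{L > 0} \mathbb{F}_{2,p}(\beta, L)$,
where

$$\mathbb{F}_{d,p}(\beta, L) = \big\{ F : \mathbb{R}^d \to \mathbb{R} \mid F(z) = f(z^T \theta), \ f \in \mathbb{N}_p(\beta, L), \ \theta \in \mathbb{S}^{d-1} \big\}.$$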



Let us briefly discuss the applicability of Theorem 2, which requires that
$f \in \mathbb{F}(\beta_0, M)$. In order to guarantee it we will assume
that $\beta p > 1$. The latter assumption is standard in estimation of
functions possessing inhomogeneous smoothness, see for example Donoho
et al. (1995), Lepski et al. (1997), Kerkyacharian et al. (2008). If
$\beta p > 1$, the embedding
$\mathbb{N}_p(\beta, L) \subset \mathbb{H}(\beta - 1/p, cL)$ with some
absolute constant $c > 0$ guarantees that $f \in \mathbb{F}(\beta_0, M)$
with $\beta_0 = \beta - 1/p$, $M = cL$, so Theorem 2 is applicable.

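Theorem 5. Let $b > 0$ be fixed, and let Assumptions 4 and 5 hold. Then,
for any $L > 0$, $p > 1$, $p^{-1} < \beta \le b$ and $r \ge 1$,

$$\sup_{F \in \mathbb{F}_{2,p}(\beta, L)} \mathcal{R}_r^{(n)}\big(\hat F_{(\hat\theta, \hat h)}, F\big) \le \kappa_3 \varphi_n(\beta, L, p),$$

where $\kappa_3$ is independent of $n$ and

$$\varphi_n(\beta, L, p) = \begin{cases} L^{\frac{1}{2\beta+1}} \big(n^{-1} \ln(n)\big)^{\frac{\beta}{2\beta+1}}, & (2\beta + 1)p > r; \\[1ex] L^{\frac{1}{2\beta+1}} \big(n^{-1} \ln(n)\big)^{\frac{\beta}{2\beta+1}} [\ln(n)]^{\frac{1}{r}}, & (2\beta + 1)p = r; \\[1ex] L^{\frac{1/2 - 1/r}{\beta - 1/p + 1/2}} \big(n^{-1} \ln(n)\big)^{\frac{\beta - 1/p + 1/r}{2\beta - 2/p + 1}}, & (2\beta + 1)p < r. \end{cases}$$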



Note that $\mathbb{F}_{2,p}(\beta, L) \supset \mathbb{N}_p(\beta, L)$. Indeed,
the class $\mathbb{N}_p(\beta, L)$ can be viewed as a class of functions
$F$ satisfying $F(\cdot) = f(\theta^T \cdot)$ with $\theta = (1, 0)^T$.
Then, the problem of estimating such (2-variate) functions can be reduced
to the estimation of univariate functions observed in the
one-dimensional regression model.

There are at least two observations arising from the latter remark.
First, the upper bound given in Theorem 5 generalizes in several
directions the results obtained in univariate regression, see for
instance Donoho et al. (1995), Delyon and Juditsky (1996), Baraud (2002),
Kerkyacharian and Picard (2004), Kulik and Raimondo (2009), Zhang et al.
(2002). In particular, the majority of the papers treat Gaussian errors
or errors possessing an exponential moment. The exception is the paper
Baraud (2002), in which some of the results are obtained under very weak
assumptions imposed on the noise (weaker than our Assumption 1). However,
these results are available only if $p = r = 2$.

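Next, the rate of convergence for the latter problem (which can be found
in Chesneau (2007)) is also a lower bound for the minimax risk defined on
$\mathbb{F}_{2,p}(\beta, L)$. Under the assumption $\beta p > 1$ this rate
of convergence is given by

$$\phi_n(\beta, L, p) = \begin{cases} L^{\frac{1}{2\beta+1}} n^{-\frac{\beta}{2\beta+1}}, & (2\beta + 1)p > r; \\[1ex] L^{\frac{1}{2\beta+1}} \big(n^{-1} \ln(n)\big)^{\frac{\beta}{2\beta+1}}, & (2\beta + 1)p = r; \\[1ex] L^{\frac{1/2 - 1/r}{\beta - 1/p + 1/2}} \big(n^{-1} \ln(n)\big)^{\frac{\beta - 1/p + 1/r}{2\beta - 2/p + 1}}, & (2\beta + 1)p < r. \end{cases}$$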



The minimax rate of convergence in the case $(2\beta + 1)p = r$ remains an
open problem, and the rate presented in the middle line above is only a
lower asymptotic bound for the minimax risk.

Thus the proposed estimator $\hat F_{(\hat\theta, \hat h)}$ is adaptive
whenever $(2\beta + 1)p < r$. In the case $(2\beta + 1)p > r$ we lose only a
logarithmic factor with respect to the optimal rate and, as mentioned in
the Introduction, the construction of an adaptive estimator over the
collection $\{\mathbb{F}_{2,p}(\beta, L), \beta > 0, L > 0\}$ in this case
remains an open problem. In view of the latter remark we conjecture that
the presented lower bound is correct and, therefore, the upper bound
result has to be improved.

3. Proofs. We start this section by presenting the quantities used in
the description of the selection rule leading to the adaptive estimator
$\hat F_{(\hat\theta, \hat h)}$.

3.1. Important quantities. Set

Cl (n) = 730ln(l6n 2 g_~^[l2Q + V2]) + 8r ln(n) + 394; 

c 2 (n) = 730\n(l6n 2 Tg-^[l2Q + V2\) + 8r ln(n) + 394; 

c 3 (n) = 365 In (2n 2 Qg^ + 8r ln(n) + 197; 

c 4 (n) = 365 In (2n 2 TQg_~^) + 8r ln(n) + 197, 

and define 

Ci(n) = 2V25-5||/C||L\/^M+(8/3)c 1 (n)(ln(n))-^5- 1 ||/C||L; 
C 2 (n) = 2V2(aV%^||/C||Lv / ^R 

+ (8/3)c 2 (n)(ln(n))~V 1 ||/C||L(^ 1 (4r + l))-; 
C 3 (n) = 2V25-^l|/C||L\/QR+(8/3)£- 1 ||/C||Lc3(n)(nf 1 2 )^^; 
C 4 (n) = 2y2(aVl)£-5||/C||^ v / ^) + (8/3)T C4 (n)(nf ) 2 )~V 1 ||/C||L; 
C 5 (n) = ||/C|| 2 + (nr, 2 )~^ 4 (n), 

where, remind r) is given in (1.5), and 

a 2 = sup / x 2 p(x)dx, r= (0~ 1 (4r + 1) ln(n))". 

In spite of the cumbersome expressions, it is easy to see that

(3.1) $\sup_{n \ge 3} \dfrac{C_i(n)}{\sqrt{\ln(n)}} =: C_i < \infty, \ i = 1, 2, \qquad \sup_{n \ge 3} C_5(n) =: C_5 < \infty.$

3.2. Proof of Theorem 1. We start the proof by establishing technical
lemmas whose proofs are postponed to the Appendix.

3.2.1. Auxiliary results. For any $\theta,\nu\in\mathbb S^1$ and $h\in[2^{-1}h_{\min},1]$ denote

$$
\begin{aligned}
S_{(\theta,h)(\nu,h)}(x)&=\det\big(E_{(\theta,h)(\nu,h)}\big)\int K\big(E_{(\theta,h)(\nu,h)}(t-x)\big)F(t)\,dt,\\
S_{(\nu,h)}(x)&=\det\big(E_{(\nu,h)}\big)\int K\big(E_{(\nu,h)}(t-x)\big)F(t)\,dt.
\end{aligned}
$$

For ease of notation, we write $h_f^*=h^*_{\mathcal K,f}(x^\top\theta^*)$.

Lemma 1. Grant Assumption 4. Then, for any $\nu\in\mathbb S^1$ and any bandwidths $\eta,h\in[2^{-1}h_{\min},1]$ satisfying $\eta\le h\le2^{-1}h_f^*$, one has

$$
\begin{aligned}
\big|S_{(\theta^*,h)(\nu,h)}(x)-S_{(\nu,h)}(x)\big|&\le2(h_f^*)^{-1/2}\|\mathcal K\|_\infty^2\sqrt{n^{-1}\ln(n)};\\
\big|S_{(\nu,h)}(x)-S_{(\nu,\eta)}(x)\big|&\le2(h_f^*)^{-1/2}\|\mathcal K\|_\infty^2\sqrt{n^{-1}\ln(n)};\\
\big|S_{(\theta^*,h)}(x)-F(x)\big|&\le(h_f^*)^{-1/2}\|\mathcal K\|_\infty\sqrt{n^{-1}\ln(n)}.
\end{aligned}
$$

Let $\mathfrak E_{a,A}$ with $a\in(0,1]$, $A\ge1$, be a set of $2\times2$ matrices satisfying

$$
|\det(E)|\le A,\qquad |E|_\infty\le(\sqrt{2a})^{-1}|\det(E)|.\tag{3.2}
$$

Here $|E|_\infty=\max_{i,j}|E_{i,j}|$ denotes the matrix supremum norm. Set for any $E\in\mathfrak E_{a,A}$

$$
J(x,E)=\sqrt{|\det(E)|}\,K\big(E(x-t)\big)[g(x)]^{-1},\quad x\in\mathbb R^2;
$$

and consider the following random fields defined on $\mathfrak E_{a,A}$:

$$
\begin{aligned}
\eta_{n,t}(E)&=\frac{1}{\sqrt n}\sum_{i=1}^n\Big\{J(X_i,E)F(X_i)-\mathbb E^{(n)}_F\big[J(X_i,E)F(X_i)\big]\Big\},\\
\xi_{n,t}(E)&=\frac{1}{\sqrt n}\sum_{i=1}^nJ(X_i,E)\varepsilon_i.
\end{aligned}
$$

Denote finally by $\mathfrak E^*$ the set of matrices $\mathfrak E_{a,A}$ with $a=1/8$ and $A=(h_{\min})^{-1}$.

Lemma 2. Grant Assumptions 1-4. Then, for any $n\ge3$ and any $r\ge1$,

$$
\mathbb P^{(n)}_F\Big\{\sup_{E\in\mathfrak E^*}\big[|\eta_{n,t}(E)|+|\xi_{n,t}(E)|\big]\ge C_1(n)\|F\|_\infty+C_2(n)\Big\}\le(8+T)n^{-4r}.
$$

The expressions $C_1(n)$ and $C_2(n)$ are given in Section 3.1.

Lemma 3. Grant Assumptions 1-4. Then, for any $n\ge n_0\vee n_1$,

$$
\sup_{\theta^*\in\mathbb S^1}\ \sup_{f\in\mathbb F(\beta_0,M)}\mathbb P^{(n)}_F\Big\{\hat F_\infty\notin\big[\|F\|_\infty,\,3M+4C_5(n)\big]\Big\}\le(8+T)n^{-4r}.
$$

The numbers $n_0$, $n_1$ are defined in (1.6) and $C_5(n)$ is defined in Section 3.1.
3.2.2. Proof of Theorem 1. In view of Jensen's inequality an upper bound
for lZr™x , T > 2 , will suffice to complete the proof. 

Let $h^*\in\mathcal H_n$ be such that $2h^*\le h_f^*\le4h^*$. Introduce the random events

$$
A=\big\{R_x^{(1)}(\theta^*,h^*)+R_x^{(2)}(h^*)=0\big\},\qquad B=\Big\{\hat F_\infty\in\big[\|F\|_\infty,\,3M+4C_5(n)\big]\Big\},
$$

and let $\bar A$ and $\bar B$ denote the events complementary to $A$ and $B$ respectively. We split the proof into three steps.

Risk computation under $A\cap B$. First, the following inclusion holds:

$$
A\subset\{\hat h\ge h^*\}.\tag{3.3}
$$

Indeed, in view of the definition of the couple $(\hat\theta,\hat h)$ we have

$$
\mathbf 1_A\mathrm{TH}(h^*)=\mathbf 1_A\big[R_x^{(1)}(\theta^*,h^*)+R_x^{(2)}(h^*)+\mathrm{TH}(h^*)\big]
\ge\mathbf 1_A\big[R_x^{(1)}(\hat\theta,\hat h)+R_x^{(2)}(\hat h)+\mathrm{TH}(\hat h)\big]\ge\mathbf 1_A\mathrm{TH}(\hat h).
$$

It remains to note that the mapping $\eta\mapsto\mathrm{TH}(\eta)$ is decreasing so inclusion (3.3) follows. Next, the triangle inequality yields



F (6h)W- F W ^ F {e»,h')(x) - F(x) + F {m {x)-F dht {x 



(8,h) y 



(e,h*y 



+ 



F, 



?*,h*)(e,h*)( x ) F (e,h*)( x ) 



(3.4) 



+ 



F, 



)*,h*)(e,h*)( x ) ~ F (e*,h*)(x) 



1°. We have in view of (3.3) and the definition of $R_x^{(2)}$ that



(3.5) 



U 



F (0h)( x ) ~ F (6h4 x ) < U[R x 2) (h) + TR(h*)]. 



(«,/»•) 



W/ 



The definition of R x (•, •) implies 



u 



F (6*,h*)(0,h*)( X ) 



F ( g h Ax) < l A [R^(8*,h*) + TB(h*)] 



(3.6) 

Note that E t 



(6,h)(v,h) 



±E, 



(u,h)(fi,h) 



= UTH(/i*). 
for any 8, v and /i. Hence, 



F (6*,h*)(6,h*)(') — F (6,h*)(0* ,h*)t) 
since $\mathcal K$ is symmetric. It yields together with (3.3) and the definition of $R_x^{(1)}$



u 

(3.7) 



F { e*,h*)(lh*)( x )- F (e*,h*)( x ) 



U 



>)i6 ^ ht) (x)-F {9 *,h*)(x) 



< l A [RW(e,h) + TH(h*)]. 
We obtain from (3.4), (3.5), (3.6) and (3.7) that 



u 



F m ^ ~ F ^ - lA i 11 ^ & Q + R x ] W] + 3 TH ^ 



+ 



^•.h*)^) --^(^ 



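Noting that in view of the definition of $(\hat\theta,\hat h)$

$$
R_x^{(1)}(\hat\theta,\hat h)+R_x^{(2)}(\hat h)\le R_x^{(1)}(\hat\theta,\hat h)+R_x^{(2)}(\hat h)+\mathrm{TH}(\hat h)
\le R_x^{(1)}(\theta^*,h^*)+R_x^{(2)}(h^*)+\mathrm{TH}(h^*),
$$

we obtain

$$
\mathbf 1_A\big|\hat F_{(\hat\theta,\hat h)}(x)-F(x)\big|\le4\,\mathrm{TH}(h^*)+\big|\hat F_{(\theta^*,h^*)}(x)-F(x)\big|.\tag{3.8}
$$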

Note also that for any r\ G 7i n 

l B TH(i7) < 2[||/C||L7rnM+(3M + 4C'5)C'i(n) + C 2 (n)](r ? n)-5 

where C 6 = 2||/C||^ + 2(3M + 4C 5 )d + 2C 2 and Ci, C 2 and C 5 are defined 
in (3.1). Since TH(/i*) < TH(/i}/4), this bound and (3.8) yield 



<8C 6 J^4 



77. /i* 



ify.^x)-^) 



(3.9) l AnB F m (x) - F(x) 

2°. Note that $E_{(\theta,h)(\nu,h)},E_{(\theta,h)}\in\mathfrak E^*$ for any $\theta,\nu\in\mathbb S^1$, $h\in[h_{\min},1]$. Set

$$
\hat F(E,x)=\det(E)\,n^{-1}\sum_{i=1}^nK\big(E(X_i-x)\big)g^{-1}(X_i)Y_i,\quad E\in\mathfrak E^*.
$$

The following "approximation + stochastic part" decomposition of $\hat F(E,x)$
will be useful in the sequel:

$$
\hat F(E,x)=\det(E)\int K\big(E(t-x)\big)F(t)\,dt+\sqrt{\det(E)/n}\,\big[\eta_{n,t}(E)+\xi_{n,t}(E)\big],\tag{3.10}
$$

where $\eta_{n,t}(E)$ and $\xi_{n,t}(E)$ are defined before the statement of Lemma 2.
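The matrix-indexed kernel estimator entering this decomposition can be sketched in a few lines. The sketch assumes the normalisation $\det(E)\,n^{-1}\sum_i K(E(X_i-x))\,g^{-1}(X_i)\,Y_i$ with a known design density $g$; the box kernel, the uniform design on $[-2,2]^2$ and the names `box_kernel`, `kernel_estimate` and `demo` are illustrative assumptions, not the paper's choices.

```python
import random


def box_kernel(u1, u2):
    """Product box kernel on [-1/2, 1/2]^2 (hypothetical choice of K)."""
    return 1.0 if abs(u1) <= 0.5 and abs(u2) <= 0.5 else 0.0


def kernel_estimate(E, x, X, Y, g):
    """det(E) * n^{-1} * sum_i K(E (X_i - x)) * Y_i / g(X_i), E a 2x2 matrix."""
    (e11, e12), (e21, e22) = E
    det = e11 * e22 - e12 * e21
    total = 0.0
    for (a, b), y in zip(X, Y):
        d1, d2 = a - x[0], b - x[1]
        total += box_kernel(e11 * d1 + e12 * d2, e21 * d1 + e22 * d2) * y / g((a, b))
    return det * total / len(X)


def demo(n=20000, seed=0):
    """Noise-free check with F = 1 and a uniform design on [-2, 2]^2."""
    rng = random.Random(seed)
    X = [(rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)) for _ in range(n)]
    Y = [1.0] * n
    g = lambda point: 1.0 / 16.0      # design density, assumed known
    E = ((2.0, 0.0), (0.0, 1.0))      # bandwidth 1/2 along the first axis
    return kernel_estimate(E, (0.0, 0.0), X, Y, g)
```

With $F\equiv1$ the approximation term equals $\det(E)\int K(E(t-x))\,dt=1$, so `demo()` should return a value close to one, up to the stochastic term.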
Thus, we have 

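$$
\big|\hat F_{(\theta^*,h^*)}(x)-F(x)\big|\le\big|S_{(\theta^*,h^*)}(x)-F(x)\big|
+\big(n^{-1}\det(E_{(\theta^*,h^*)})\big)^{1/2}\big|\eta_{n,t}(E_{(\theta^*,h^*)})+\xi_{n,t}(E_{(\theta^*,h^*)})\big|.
$$

Taking into account that $\det(E_{(\theta^*,h^*)})=(h^*)^{-1}\le4(h_f^*)^{-1}$ in view of the
definition of $h^*$ and using the third assertion of Lemma 1 we obtain

$$
\big|\hat F_{(\theta^*,h^*)}(x)-F(x)\big|\le(nh_f^*)^{-1/2}\Big[\sqrt{\ln(n)}\,\|\mathcal K\|_\infty+2\big|\eta_{n,t}(E_{(\theta^*,h^*)})+\xi_{n,t}(E_{(\theta^*,h^*)})\big|\Big].
$$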

Applying the Rosenthal inequality to Vn,t(E(g*^*)) + £,n,t{E(g* ^*)) which is 
a sum of centered independent random variables we obtain from (3.9) 



(3.11) 



E 



(n) 



%,£) ( x ) ~ F ( x ) 1 - 4nB } ' - c ° ^/(^^/) _1 ln(n), 



where Co is independent of F and n. 

Risk computation under $\bar B$. Since $f\in\mathbb F(\beta_0,M)$ we have the following
obvious bound



8=1 



where we have also used that $nh_{\min}>1$. Hence, in view of the Rosenthal
inequality we obtain



E 



CO 



F m (x)-F(x] 



■2r 



< nci, 



where c\ is independent of F and n. 

Using the Cauchy-Schwarz inequality together with the statement
of Lemma 3 leads to the following bound:



(3.12) {e£> \F m (x) - F(x)\ r l^y < nci 



p( n) (F)l 2r ^ciCS + TJ^n- 1 . 
Risk computation under $\bar A\cap B$. We note that

$$
\mathbb P^{(n)}_F(\bar A\cap B)\le\mathbb P^{(n)}_F\big\{R_x^{(1)}(\theta^*,h^*)>0,\ B\big\}+\mathbb P^{(n)}_F\big\{R_x^{(2)}(h^*)>0,\ B\big\}.
$$

1°. First, let us bound from above $\mathbb P^{(n)}_F\{R_x^{(1)}(\theta^*,h^*)>0,\ B\}$. We have
{R^(9*,h*) >()} = |J { sup \F {g . Mv>ri) (x) - F { „ >v) (x)\ > TH (r,)} 



riGM n ,V<h* 



and, therefore 

(3.13) P^ } {i?W(r ; /i*) > 0, b\ < 

FP jsup |F (fl . i2 - fc)(l/i2 - fe) (x) -F (Vi2 - fc) (x)| :> TH(2^ fc ), B 



fc: 2- 1 h min <2- k <h* 



U& 1 



Thus, denoting by ? n = sup^g^ [|r? n ,t(i?)| + |^ n ,t(^)|] and using (3.10) 
together with the first assertion of Lemma 1 we obtain for any k : 2~ k < h* 



sup 

U& 1 



F (0* ,2- k ){v,2~ k ){ x ) ~ F (u,2~ k )( x ) 

C2(h* f )- l l 2 \\lC\\l ^n~nn(n) + 2V¥n- l ^ qn 
< 2 ||/C||£, J2 k n~ l ln(n) + 2(2 fc n" 1 ) 1/2 ?n . 



(3.14) 



Here we have also used that 2 1 h*r > 2 . Note also that 



(3.15) l s TH(r?) > 2||/C 



|2 

loo -1 



'ln(n) 2 



{HFlUCi^ + CaCn)}, 



and, therefore, we obtain from (3.14) for any k satisfying 2~ k < h* 

P ( P (sup |iV,2-*)(,, 2 -*)(*) " Fr Vi2 - k) (x)\ > TH (2~ fc ), B 

Ues 1 



<P£{^ > ||i ? ||ooCi(n) + C 2 (n)j < (8 + T)n 

in view of Lemma 2. It yields, together with (3.13) 
(3.16) PP(R^{6*,h*) > 0, B) < (8 + T)log 2 (n)n" 



- It 



-It 
2°. Now, let us bound from above $\mathbb P^{(n)}_F\{R_x^{(2)}(h^*)>0,\ B\}$. We have



{R^(h*)>0}= |J 

■qeH n : n<h*<h* 



sup 

0GS 1 



F {e , h *)(x) - F m (x) >TH(2~ fc ) , 



hence, 
(3.17) 

k: 2- 1 h min <2- k <h* 



FP{R^(h*)>0, B} 
{sup F (eM) (x) - F {dt2 - k) (x) >TH(2- fc ), b) . 



Using (3.10) and the second assertion of Lemma 1 we obtain for any k 
satisfying 2~ k < h* 



sup 

ees 1 



F (e,h*){x) - F {d)2 - k) {x) 



< 2 ||/C||^v / 2 fc n- 1 ln(n) + 2(2 



k„-i\i/2 



n 



fn- 



Applying Lemma 1, we get in view of (3.15) for any k satisfying 2 < h* 



i n) i sup 



F{e,h*){x) -F {6:2 - k) (x) 



>TH(2~ fc ), B 



< P^J ?n > ll^llocCiCn) + C 2 (n) \<(8 + T)n~ 4r . 



F l pUu^(h*) >0, 



It yields, together with (3.17) 

Thus, we obtain from (3.16) and (3.18) that 

fPCACiB) < (8 + T)21og 2 i 
It yields together with (3.2.2) that 



Ir 



n)n 



- lr 



An) 



(3.19) E™ F, n J X )-F{x) 



1 



AnB 



< nc\ 



*P{AC\B) 



2 r . - jL 

< C2n 2 , 



where c 2 is independent of F and n. 

The assertion of the theorem follows now from (3.11), (3.12) and (3.19). 
3.3. Proof of Theorem 3. Using the standard computation of the bias of
kernel estimators, under Assumptions 4 and 5 we get for any $f\in\mathbb H(\beta,L)$
and any $z\in\mathbb R$



A;c/(M) < 






< WlCWooLh?. 



Since the right-hand side of the latter inequality is independent of $z$, we have
$\Delta^*_{\mathcal K,f}(h,z)\le\|\mathcal K\|_\infty Lh^\beta$. This implies $h^*_{\mathcal K,f}(z)\ge\big((L^2n)^{-1}\ln(n)\big)^{1/(2\beta+1)}$ for
any $z\in\mathbb R$ so the assertion of the theorem follows from Theorem 1.
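The bandwidth bound used in this proof comes from equating a bias bound of order $Lh^\beta$ with a stochastic bound of order $\sqrt{\ln(n)/(nh)}$. The sketch below checks this balance numerically; it assumes the common factor $\|\mathcal K\|_\infty$ cancels on both sides, and the function names are illustrative.

```python
import math


def holder_bandwidth(beta, L, n):
    """Bandwidth solving L * h**beta == sqrt(ln(n) / (n * h))."""
    return (math.log(n) / (L ** 2 * n)) ** (1.0 / (2 * beta + 1))


def bias_bound(beta, L, h):
    """Bias bound L * h**beta (kernel sup-norm factor dropped)."""
    return L * h ** beta


def stochastic_bound(n, h):
    """Stochastic bound sqrt(ln(n) / (n * h)) (same factor dropped)."""
    return math.sqrt(math.log(n) / (n * h))


def pointwise_rate(beta, L, n):
    """L^{1/(2 beta + 1)} * (ln(n) / n)^{beta / (2 beta + 1)}."""
    return L ** (1.0 / (2 * beta + 1)) * (math.log(n) / n) ** (beta / (2 * beta + 1))
```

At `h = holder_bandwidth(beta, L, n)` both bounds coincide with `pointwise_rate(beta, L, n)`, the rate attached to the ball $\mathbb H(\beta,L)$.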



3.4. Proof of Theorem 4. We start this section with an auxiliary result
used in the proof of the second assertion of the theorem. It was established 
in Kerkyacharian et al. (2008), Corollary 2 of Proposition 5 and, for conve- 
nience, we formulate it as Lemma 4 below. 

3.4.1. Auxiliary result. The result cited below concerns a lower bound for 
estimators of an arbitrary mapping in the framework of abstract statistical 
model. We will not present it in full generality and below a version reduced 
to the estimation at a given point is provided. 

Let $\mathcal F$ be a non-empty class of functions and let $F:\mathbb R^d\to\mathbb R$ be an unknown
function from model (1.1)-(1.2). The aim is to estimate the functional
$F(t)$, $t\in[-1/2,1/2]^d$.

Introduce the following notation. For any given $F,G\in\mathcal F$ set

$$
Z(F,G)=\prod_{i=1}^n\frac{p\big(Y_i-F(X_i)\big)}{p\big(Y_i-G(X_i)\big)}.
$$



Lemma 4. Assume that for any sufficiently large n > 1 there exist a 
positive integer N n , c > 1 and functions Fq,..., Fn„ £ T such that: 



\Fi(jb) - F (t)\ = A„, 



(3.20) 
(3.21) 

" ViVn j=l 
Then, for r > 1 and any t £ [-1/2, l/2] d 



Vi 



1,...,JV„; 



j=i 



<c. 



inf sup [Wf'\F(t)-F(t)\ 
F FeT 



1 
> - 
~ 2 



c + 3 
3.4.2. Proof of Theorem 4. The proof is based on the construction of
$F_0,\dots,F_{N_n}$ satisfying conditions (3.20)-(3.21) of Lemma 4.

1°. Firstly, we construct $F_0,\dots,F_{N_n}$ and verify (3.20). Let $w:\mathbb R\to\mathbb R$ be
such that $\operatorname{supp}(w)\subset(-1/2,1/2)$, $w\in\mathbb H(\beta,1)$, $\|w\|_\infty<\infty$, and $w(0)\neq0$.

Put $h=\big(a(L^2n)^{-1}\ln(n)\big)^{1/(2\beta+1)}$, where $a>0$ will be chosen later. Define

$$
f(z)=Lh^\beta w(zh^{-1}),\quad z\in\mathbb R.\tag{3.22}
$$

For $b>0$ put $N_n=n^b$ assuming without loss of generality that $N_n$ is an
integer. The value of $b$ will be determined later in order to satisfy (3.21).
Let $\{\vartheta_i,\ i=1,\dots,N_n\}\subset\mathbb S^{d-1}$ be defined as follows:

$$
\vartheta_i=\big(\theta_i^{(1)},\theta_i^{(2)},0,\dots,0\big)^\top,\qquad\theta_i^{(1)}=\cos(i/N_n),\quad\theta_i^{(2)}=\sin(i/N_n).
$$

Finally we set

$$
F_0\equiv0\quad\text{and}\quad F_i(x)=f\big(\vartheta_i^\top(x-t)\big),\quad i=1,\dots,N_n.\tag{3.23}
$$

Obviously, $f$ defined by (3.22) belongs to $\mathbb H(\beta,L)$ so all $F_i$ are in the class
$\mathcal F=\mathbb F_d(\beta,L)$. Moreover, for any $i=1,\dots,N_n$

$$
|F_i(t)-F_0(t)|=|w(0)|Lh^\beta=|w(0)|L^{\frac{1}{2\beta+1}}\big(an^{-1}\ln(n)\big)^{\frac{\beta}{2\beta+1}}=|w(0)|a^{\frac{\beta}{2\beta+1}}\psi_n(\beta,L).
$$

We see that (3.20) holds with $\lambda_n=|w(0)|a^{\frac{\beta}{2\beta+1}}\psi_n(\beta,L)$.
2°. Note that 

2 



E 



(n) 

F 



AS 



£e£>2(F,F )]+-L £ ^N^^Fo) 



m 



i=i 






We have 



Eff^C^-fo)} 



E 



Fo \ 



Z(F,F )Z(F, 



^o)} 



i p(y) 

p(?/-F j (x))p(y-F fc (x)) 

p(y) 



g(x)dxdy [> ; 

g(x)dxdy 
Since limn^oo sup i=lj ^ Nn \\Fj 
for all n large enough 

-p 2 {y-F j {x)Y 



0, we have in view of Assumptions 6 and 7 



/ 



p(y) 



(3.24) 



g(x)dxdy < 1 + £} / F- (x)g(x)dx 

jR d 

<l + £} 5 / Ff(x)(l + \x\f)~ 1 dx; 

JR d 



p(y - Fj(x))p(y - F k (x)) 



p(y) 



(3.25) 



g(x)dxdy < 1 + 1} / \Fj{x)F k {x)\g{x)dx 

JR d 

<1+Hg / \F j (x)F k (x)\(l + \x\fy 1 dx. 

JR d 



Set $\theta_{j\perp}=\big(-\sin(j/N_n),\cos(j/N_n)\big)^\top$ and $\vartheta_{j\perp}=\big(\theta_{j\perp}^\top,0,\dots,0\big)^\top\in\mathbb S^{d-1}$.

Denote for all $j=1,\dots,N_n$ by $\Theta_j$ the orthogonal matrix $(\vartheta_j,\vartheta_{j\perp},e_3,\dots,e_d)$,
where $e_s$, $s=3,\dots,d$, are the canonical basis vectors in $\mathbb R^d$. Integration by
substitution with $\Theta_j^\top(x-t)=v$ gives



dx 



L 2 h 2 ^ 



w 



! ^i//i)(1 + |x + 97t;|1 7 ) X dv 



< C m L 2 \\wgh 2l3+1 = aC7 ro ||u;|^n^ 1 ln(n), 



/ F 2 (x){l + \x\l 

jR d 

(3.26) 

where we have denoted C m = f R d-i (l + |t + v|;f) dv, x = (x 2 , • • • , Xd) T , 
and v= (v 2 ,...,v d ) T . 

We deduce from (3.4.2) and (3.26) that for n sufficiently large 



(3.27) 



sup E^{Z 2 (Fj,F )} < n aQsC ^ w ^l 



For any $j\neq k$ set $\Theta_{j,k}=(\vartheta_j,\vartheta_k,e_3,\dots,e_d)$. By a change of variables
with $\Theta_{j,k}^\top(x-t)=v$ we have



i \F j (x)F k (x)\(l + \x\^) X dx 

JR d 



< \dzt{® hk )\- 1 L 2 h 2 P \w{ Vl /h)w{v 2 /h)\(l + \x + ^l 



R d 



-1 1^7 
'3,k V \2 



dv 



< |det(e j , fc )|- 1 c ro L 2 / 1 2/3+2 || U ;||? : 



where we have put c TO = f Rd -2 (l + l x + Y \ 2 ) dv, v = ( v 3> • • • > v d) T an d 
x = (X3, . . . , Xd) T ■ Note that 

$$
|\det(\Theta_{j,k})|=\big|\cos(j/N_n)\sin(k/N_n)-\cos(k/N_n)\sin(j/N_n)\big|
=\big|\sin\big((k-j)/N_n\big)\big|\ge\sin(1/N_n)\ge(2N_n)^{-1},
$$

for sufficiently large $n$. Thus we finally get

/ \Fj{x)F k {x)\(l + \x\%y l dx < 2ac UJ \\w\\\n- l \n{n)[N n h}. 
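The determinant bound used here, $|\sin((k-j)/N_n)|\ge\sin(1/N_n)\ge(2N_n)^{-1}$ for $j\neq k$ in $\{1,\dots,N_n\}$, is easy to confirm numerically. A small sketch (the helper name `min_abs_det` is ours):

```python
import math


def min_abs_det(N):
    """Smallest |sin((k - j) / N)| over all pairs j != k in {1, ..., N}."""
    return min(
        abs(math.sin((k - j) / N))
        for j in range(1, N + 1)
        for k in range(1, N + 1)
        if j != k
    )
```

Since $|k-j|/N_n$ stays in $(0,1)$, where the sine is increasing, the minimum is attained at $|k-j|=1$.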
Hence, choosing b < 2/(2/3 + 1) we obtain for all n large enough 

(3.28) sup / \F j (x)F k (x)\(l + \x\^y 1 dx<2ac m \\w\\lr~ 

j^k; j,k=l,...,N n jR d 

We deduce from (3.4.2) and (3.28) 

(3.29) sup E%>\z(Fj,Fo)Z(F k ,F )} < e 2aQ » c ^l 

j^k;j,k=l,...,N n <- > 

We obtain from (3.27) and (3.29) 

N n s 2 



;n 






2aQgc n7 ||iu||? 



Choosing a = 6(0flC a7 ||«;||2) we see that (3.21) holds with c = 1 + 
e 2 0Cro '' M '" 1 . Since that the constant c appeared in (3.21) is chosen inde- 
pendently of L, the assertion of the theorem follows from Lemma 4. | 



3.4.3. Proof of Theorem 5. To prove the theorem we will exploit the 
ideas developed in Lepski et al. (1997). Moreover, our considerations are, to 
a great degree, based on the technical result of Lemma 5 below. Its proof is
postponed to the Appendix.

Lemma 5. Grant Assumptions 4 and 5. Then, for any p > 1, < s < b 
and Q > we have 

sup || A£ ff (M|L< 2r p g/ l l/C|| 00 [2 s " - 1]4 , V/i > 0. 
p(*,C) 



Here r p is a depending only of p constant from the (p,p)-strong maximal 
inequality. 

Proof of Theorem 5. It suffices to prove the theorem only in the case
$r\ge p$. Indeed, recall that the risk $\mathcal R_r^{(n)}(\cdot,\cdot)$ is described by the $L_r$ norm
on $[-1/2,1/2]$, therefore

$$
\mathcal R_r^{(n)}(\cdot,\cdot)\le\mathcal R_p^{(n)}(\cdot,\cdot),\quad r<p.
$$

Hence the case $r<p$ can be reduced to the case $r=p$.

In view of Theorem 2 in order to obtain the assertion of the theorem it 



suffices to bound from above 



ln(rt) 



1/2 

r/2 



Set $\Gamma_0=\{y\in[-1/2,1/2]:\ h^*_{\mathcal K,f}(y)=1\}$ and $\Gamma_k=\{y\in[-1/2,1/2]:\ h^*_{\mathcal K,f}(y)\in(2^{-k},2^{-k+1}]\cap[h_{\min},1]\}$ for $k=1,2,\dots$. Later on, the integration
over an empty set is supposed to be zero. We have



ln(n) 



nh' KJ (-) 



E 

fc>i 



i\ 



ln(n) 
nh* Kf (y) 



dy 



ln(n) 
o \nh* Kf (y) 



dy. 



In what follows $c_i$, $i=1,2,\dots$, denote constants independent of $n$, $f$ and $L$.
The definition of $\Gamma_0$ implies



(3.30) 



ln(ro) 



To \ n hhj(y) 

We have in view of (2.2) for any k > 1 



dy < c\ [n x ln(n)] 



(3.31) 



KkA h hf(y)>y) 



nh* Kf (y) 



, Vy G T fc . 



Let $0\le q_k\le r$ be a sequence whose choice will be made later. We obtain
from (3.31)



fc>i 



E 

fe> 
(3.32) 



ln(n) 



nh* Kf (y) 



dy < c 2 ^ 



<c 2 £ 



fc>i 



fc>i 



ln(n) 



ln(n) 
n2~ k 



2 



A b( 2 .»)) d y 



'a, 



■n 



2 -fc 



2 



A 



£,/ 



'2^,2/ 



'//,• 



dy =: H. 



To get the first inequality we have used that $\Delta^*_{\mathcal K,f}(\cdot,y)$ is a monotonically
increasing function.

The computation of the quantity on the right-hand side of (3.32), including
the choice of $(q_k,\ k\ge1)$, will be done differently depending on $\beta$, $p$
and $r$.

1°. Case $(2\beta+1)p>r$. Put $h^*=\big[L^{-2}n^{-1}\ln(n)\big]^{\frac{1}{2\beta+1}}$ and choose $q_k=p$
if $2^{-k}\le h^*$ and $q_k=0$ if $2^{-k}>h^*$.




Applying Lemma 5 with $p=p$, $s=\beta$ and $Q=L$ we get



S < %z? E 



k: 2~ k <h* 



ln(n) 
n2~ k 



r — p 
~2~ 



~k/3p 



_ /ln(n) 



+ c 4 



nh* 



(3.33) < c 5 



tfin^Wn))^ E 2~ fc K- 1 ?] + 



fc: 2~ k <h* 



ln(n) 
n/i* 



Because in the considered case $\beta p-\frac{r-p}{2}>0$, we obtain



S < c 6 



L p (ln(n))— (/i*) 



^ I 2 £ + 



ln(ra) 
n/i* 



It remains to note that h* is chosen by balancing two terms on the right-hand 
side of the latter inequality. It yields 



(3.34) 



rH 



<c 7 LV>+~ 1 (n _1 ln(n)) 2 ' 3 + 1 . 



The argument in the case $(2\beta+1)p>r$ is completed with the use of
Theorem 2, (3.30) and (3.34).
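The balancing step of this case can be checked numerically. The sketch assumes that the two balanced terms are $L^p\,(n^{-1}\ln(n))^{(r-p)/2}(h^*)^{\beta p-(r-p)/2}$ and $(\ln(n)/(nh^*))^{r/2}$, as read from the displays above, with all constants dropped; the function name is illustrative.

```python
import math


def balance_dense_case(beta, p, r, L, n):
    """Return (h_star, first_term, second_term, closed_form) for (2*beta+1)*p > r."""
    t = math.log(n) / n
    h_star = (t / L ** 2) ** (1.0 / (2 * beta + 1))
    first = L ** p * t ** ((r - p) / 2) * h_star ** (beta * p - (r - p) / 2)
    second = (t / h_star) ** (r / 2)
    closed = L ** (r / (2 * beta + 1)) * t ** (r * beta / (2 * beta + 1))
    return h_star, first, second, closed
```

Both terms agree at this $h^*$ and equal $L^{r/(2\beta+1)}(n^{-1}\ln(n))^{r\beta/(2\beta+1)}$, which is the order appearing in (3.34).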

2°. Case $(2\beta+1)p=r$. Put $h^*=1$ and choose $q_k=p$ for all $k\ge1$.
Repeating the computations leading to (3.33) we get



(3.35) 



E< c 8 ln(n)L p (n _1 ln(n)) 



r — p 
~2~ 



Here we have used that $\beta p-\frac{r-p}{2}=0$ and that the summation in (3.32)
runs over $k$ such that $2^{-k}\ge h_{\min}$, since otherwise $\Gamma_k=\emptyset$. It remains to
note that the equality $(2\beta+1)p=r$ is equivalent to $p/r=1/(2\beta+1)$
and $(r-p)/2r=\beta/(2\beta+1)$. The assertion of the theorem in the case
$(2\beta+1)p=r$ follows now from Theorem 2, (3.30) and (3.35).
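The algebraic identities invoked on the boundary $(2\beta+1)p=r$ can be verified in exact arithmetic. A minimal sketch (the helper name `boundary_identities` is ours):

```python
from fractions import Fraction


def boundary_identities(beta, p):
    """Check the identities implied by r = (2*beta + 1) * p, exactly."""
    beta, p = Fraction(beta), Fraction(p)
    r = (2 * beta + 1) * p
    return (
        p / r == 1 / (2 * beta + 1),
        (r - p) / (2 * r) == beta / (2 * beta + 1),
        beta * p - (r - p) / 2 == 0,
    )
```

The third component is the vanishing exponent $\beta p-(r-p)/2=0$, which is what produces the extra $\ln(n)$ factor in (3.35).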

3°. Case $(2\beta+1)p<r$. Choose $q_k=r$ if $2^{-k}\le h^*$ and $q_k=p$ if $2^{-k}>h^*$,
where the choice of $h^*$ will be made later.

The following embedding holds, see Besov et al. (1979): N p (/3, L) C N r ((3- 
1/p + 1/r, cgL) . Thus, applying Lemma 5 with p = r, s = (3 — \jp + 1/r and 
Q = cqL we get 



-i 



(3.36) 



E 

2- fc < 

E 



k:2~ k <h* 



ln(n) 
n2~ fe 



"9fc 



A kA 2 



l-k 



'Ik 



dy 



k:2~ k <h* 



A hf( 2l ~ k >y 



dy < c 9 L r (/i*) /3r " (r/p)+1 . 
Applying Lemma 5 with $p=r$, $s=\beta$ and $Q=L$ we get

k:2~ k >h* V 7 



= CHjL^n^lnH) 1 ^ J^ 2- fc [ /9p -T i! ] 

fc:2- fc >h* 

(3.37) < cnL p (n~ 1 ln(n)) I? (/i*) /3p - 1 ^. 

Here we have used that $\beta p-\frac{r-p}{2}<0$. In view of (3.36) and (3.37) we choose
$h^*$ from the equality:

$$
L^r(h^*)^{\beta r-(r/p)+1}=L^p\big(n^{-1}\ln(n)\big)^{\frac{r-p}{2}}(h^*)^{\beta p-\frac{r-p}{2}}.
$$

It yields $h^*=\big(L^{-2}n^{-1}\ln(n)\big)^{\frac{1}{2\beta-2/p+1}}$ and we obtain finally that

r(l/2-l/r) r(0-l/p+l/r) 

(3.38) E< C12L0-VP+ 1 /* (ra -1 ln(n)) V-a/p+i . 

The assertion of the theorem in the case (2/3 + l)p < r follows now from 
Theorem 2, (3.30) and (3.38). I 
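The choice of $h^*$ in this last case can be checked numerically. The sketch assumes the balance equation $L^r(h^*)^{\beta r-r/p+1}=L^p(n^{-1}\ln(n))^{(r-p)/2}(h^*)^{\beta p-(r-p)/2}$, which is our reading of the display preceding (3.38), with all constants dropped; the function name is illustrative.

```python
import math


def balance_sparse_case(beta, p, r, L, n):
    """Return (h_star, lhs, rhs, closed_form) for the regime (2*beta+1)*p < r."""
    t = math.log(n) / n
    h_star = (t / L ** 2) ** (1.0 / (2 * beta - 2.0 / p + 1))
    lhs = L ** r * h_star ** (beta * r - r / p + 1)
    rhs = L ** p * t ** ((r - p) / 2) * h_star ** (beta * p - (r - p) / 2)
    closed = (
        L ** (r * (0.5 - 1.0 / r) / (beta - 1.0 / p + 0.5))
        * t ** (r * (beta - 1.0 / p + 1.0 / r) / (2 * beta - 2.0 / p + 1))
    )
    return h_star, lhs, rhs, closed
```

Both sides of the balance equation coincide at this $h^*$ and reduce to the closed form whose exponents are $r$ times those in the third line of the rate display.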

4. Appendix. 

4.1. Proof of Lemma 1. 

Proof of the first assertion. The symmetry of the kernel JC, see Assump- 
tion 4, implies 

»s'(-0*,/i)(^/i)(-) = £V,/i)(-^)(-)' s , (_ l/]h )(-) = s r (i/ ) / l )(-)- 

Therefore it suffices to prove the first assertion of the lemma under the 
condition u T 9* > 0. In this case EiQ*^\t v ^ = -E(0*,h)(^/i) and we note that 



(4.1) Erg. 



E (6*,h) + E (u,h) 



(9*,h){u,h) 

For any 6 = (6>i,6> 2 ) € S 1 let 0_l = (-6» 2 ,6»i). Using (4.1) we obtain 
S(e*, h) (y, h ){x) = J K(u)f(h[9* + u] T 6* Ul + [9*± + u ± ] T e*u 2 + x T 6*)du 
= f f K{u l )K{u2)f{h[l + u Y e*}ui + ule*U2 + x v e*)du l du 2 . 
We also have

$$
S_{(\nu,h)}(x)=\iint\mathcal K(u_1)\mathcal K(u_2)f\big(h\nu^\top\theta^*u_1+\nu_\perp^\top\theta^*u_2+x^\top\theta^*\big)\,du_1du_2.
$$

Put $S^*_\nu(x)=\int\mathcal K(u_2)f\big(\nu_\perp^\top\theta^*u_2+x^\top\theta^*\big)\,du_2$ and consider two cases.

1°. $\nu_\perp^\top\theta^*=0$. In this case $S^*_\nu(x)=f(x^\top\theta^*)$ and



S(y,h)( x ) = / )C{ui)f(hui + x 6*)dui = - / K. 



t — x 



T, 



S{0*,h){v,h){x)= JC(ui)f(2hui+x 9*)du 



2h 



K, 



h 

t — x 



2h 



f(t)dt, 
f(t)dt. 



Here we have used that $\nu_\perp^\top\theta^*=0$ together with $\nu^\top\theta^*\ge0$ implies $\nu=\theta^*$.
Thus, we obtain



(4.2) 



\S( 6 *,h)(v,h)(x) ~ S( u , h )(x)\ 

< \S(e*,h)(v,h)(%) ~ K( x )\ + \S(v,h)( x ) ~ s t( x )\ 

< A KJ {h,x T e*) +A KJ (2h,x T e*) < 2A* KJ (2h,x T 9*) 



2°. $\nu_\perp^\top\theta^*\neq0$. In this case we have



s;(x) 



S($* : h)(v,h)( x ) 



K 



Vl 



KL 



h(l+v ] 'e*)J'^\ \v[6*\ 

h(l + i/ T 9*)\i/Je*\ 



K 



K. 



V2—X 



h(l+v T 8*)J \ \vje*\ 

h(l + v T 9*)\vj9*\ 



f(v 2 )dvidv2 



f(y\ + V2)dv\dv2- 



Here we have used once again the symmetry of $\mathcal K$. Thus, taking into account
that $|\nu^\top\theta^*|\le1$, we get



\S{0*,h){v,h)( x ) ~ St{x)\ 



< 



Ja*\ 



K 



V 2 - X 



T a* 



,T a*\ 



sup 

8<2h 



]z(j) [f(vi + V2)-f(v 2 )]dv, 



dvo 



< y/ciioosup 

a>0 



i rx T 8*+a/2 

- / sup 

a Jx T e*-a/2 6<2h 



J]k(j) [f(vi + v 2 )-f(v 2 )]dv 1 



d^2 



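Here we have used that $\operatorname{supp}(\mathcal K)\subset[-1/2,1/2]$ (Assumption 4 (1)). Hence,

$$
\big|S_{(\theta^*,h)(\nu,h)}(x)-S^*_\nu(x)\big|\le\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(2h,x^\top\theta^*).\tag{4.3}
$$

If $\nu^\top\theta^*\neq0$ we obtain by the same computations

$$
\big|S_{(\nu,h)}(x)-S^*_\nu(x)\big|\le\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(h,x^\top\theta^*).
$$

Noting that $S_{(\nu,h)}(\cdot)=S^*_\nu(\cdot)$ if $\nu^\top\theta^*=0$ we get

$$
\big|S_{(\nu,h)}(x)-S^*_\nu(x)\big|\le\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(h,x^\top\theta^*),\tag{4.4}
$$

that yields together with (4.3)

$$
\big|S_{(\theta^*,h)(\nu,h)}(x)-S_{(\nu,h)}(x)\big|\le2\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(2h,x^\top\theta^*).\tag{4.5}
$$

Finally, taking into account that in view of Assumption 4 (1) $\|\mathcal K\|_\infty\ge1$, we
obtain from (4.2) and (4.5) that

$$
\big|S_{(\theta^*,h)(\nu,h)}(x)-S_{(\nu,h)}(x)\big|\le2\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(2h,x^\top\theta^*)\le2\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(h_f^*,x^\top\theta^*),
$$

since we consider $h$ such that $2h\le h_f^*$. The definition of $h_f^*$ implies

$$
\Delta^*_{\mathcal K,f}(h_f^*,x^\top\theta^*)\le(h_f^*)^{-1/2}\|\mathcal K\|_\infty\sqrt{n^{-1}\ln(n)}
$$

and the first assertion of the lemma follows.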

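Proof of the second and third assertions. In view of (4.4) for $\eta\le h\le h_f^*$

$$
\begin{aligned}
\big|S_{(\nu,\eta)}(x)-S_{(\nu,h)}(x)\big|&\le\big|S_{(\nu,\eta)}(x)-S^*_\nu(x)\big|+\big|S_{(\nu,h)}(x)-S^*_\nu(x)\big|\\
&\le\|\mathcal K\|_\infty\big[\Delta^*_{\mathcal K,f}(\eta,x^\top\theta^*)+\Delta^*_{\mathcal K,f}(h,x^\top\theta^*)\big]\le2\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(h,x^\top\theta^*)\\
&\le2\|\mathcal K\|_\infty\Delta^*_{\mathcal K,f}(h_f^*,x^\top\theta^*)\le2(h_f^*)^{-1/2}\|\mathcal K\|_\infty^2\sqrt{n^{-1}\ln(n)},
\end{aligned}
$$

in view of the definition of $h_f^*$. The second assertion is proved.
We have for any $h\le h_f^*$

$$
\begin{aligned}
\big|S_{(\theta^*,h)}(x)-F(x)\big|&=\Big|\frac1h\int\mathcal K\Big(\frac uh\Big)\big[f(u+x^\top\theta^*)-f(x^\top\theta^*)\big]\,du\Big|\\
&\le\Delta_{\mathcal K,f}(h,x^\top\theta^*)\le\Delta^*_{\mathcal K,f}(h,x^\top\theta^*)\le\Delta^*_{\mathcal K,f}(h_f^*,x^\top\theta^*)\\
&=(h_f^*)^{-1/2}\|\mathcal K\|_\infty\sqrt{n^{-1}\ln(n)},
\end{aligned}
$$

in view of the definition of $h_f^*$. The third assertion is proved.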
4.2. Proof of Lemma 2. The proof of the lemma is based on the auxiliary
result (4.12) justified in part 1° below.

1°. Let $(\mathcal V,\mathfrak m)$ be a measurable space and let $Z\subset[-B,B]^s$, $s\ge1$, for
some given $B\ge1$. Let $V_1,\dots,V_n$ be i.i.d. $\mathcal V$-valued random variables and
later on $\mathrm P$ denotes the probability law of $V_1,\dots,V_n$.

Suppose we are given $G:\mathcal V\times Z\to\mathbb R$ and consider the random field

$$
\zeta_n(z)=\frac{1}{\sqrt n}\sum_{i=1}^n\big[G(V_i,z)-\mathbb EG(V_i,z)\big],\quad z\in Z.
$$

Below we will be interested in establishing a tail probability inequality
for $\sup_{z\in Z}|\zeta_n(z)|$. To do that we will apply Proposition 1 in Lepski (2013)
with the functional $\Psi(\cdot)=|\cdot|$. It is convenient to impose the following
assumptions.

(4.6) sup sup \G(v,z) | =: G*^ < oo. 

veY zez 

There exist a S (0, 1] and R > 1 such that 

(4.7) b oo (z,z'):=sup\G{v,z)-G{v,z')\<R\z-z'\% , Vz,z'eZ. 

Here | • |oo denotes the vector supremum norm on M s . Note that 



(4.8) a(z, z') := y/2E\G(V!, z) - G(V U z')\ 2 < yfib^z, z'). 

Set b = (4/3)n~2 6 00 and equip Z with the semi-norms a and b. Denote also 
by Q: a (e), <£b(e), e > 0, the e-entropy of Z measured in the metrics generated 
by a and b respectively. Put finally £(5), the <5-entropy of [— B, B] s measured 
in | • |oo. Since obviously (E(<5) < s [ In (S/J) ] , we obtain in view of (4.7) 
and (4.8) for any e > 0, taking into account that B > 1, a < 1, 

(4.9) (S a (e) < g('[2v / 2i?]~ 1/a e 1/Q ) < - [in (2v / 2Si?e~ 1 )] ; 

(4.10) 6b(e) < e([8i?/3]- 1/Q [v^e] 1/Q ) < -[^([S/Sl-B^n-^e- 1 ) 



o 



where [t] + denotes the positive part of t. Here we have used also the fact 
that e-entropy of Z is less than (e/2)-entropy of [— B,B] S whatever is the 
metric in which these entropies are measured. 

Set s(u) = (3/4)u _4 ,u > 0, and introduce the quantities 

e^HsuprW^Y e b (x) = su P r 1 C: b (^), x > 0. 
First, we note that the bounds (4.9) and (4.10) guarantee, for any x > 0, 
(4.11) e a (x) < oo, &b{x) < co- 

Next, denoting G 2 = sup ze2 : (E|G(Vi, z)\ 2 ) , we get in view of the triangle 
inequality 

sup &(z,z') < 2V2G* 2 , sup b(z,z') < (8/3)n~5G^, 

z,z'ez z,z'ez 

that implies that <£ a (e) = for all e > 2y/2G 2 , and <S b (e) = 0, for all 
e > (8/3)n~2G^ . Hence, for all x\ > G 2 and x 2 > G^, we obtain 



2>/2xi) = sup<5- 2 (S a f2v / 2xi(48<5)- 1 s((5) N ); 

' 5><5 V ' 

e b ((8/3)n-^x 2 ) = sup <rW (8/3)n-2x 2 (485)- 1 s(5)) , 
v y 5><5 V y 

where <5o = 2~s. That together with (4.9) and (4.10) leads to 

e a (2V2xi) < (s/a){5.3[ln( J Bi?x 1 - 1 )] + + 3.2} ; 

e b ((8/3)n-^x 2 ) < (s/a)J2.3[ln (BRx~ l )] + + 0.85}. 

Put finally e(x x , x 2 ) = (s/a)J7.6[ln (fii?[xi A x 2 ] _1 )] + + 4.l} and define 
for any r > 1 



f/ n (xi,x 2 ) = 2A/2X1 \/24e(xi , x 2 ) + 8 r ln(n) 

24e(xi,x 2 ) + 8rln(n) 



8 
3v™ 



It remains to note that the verification of the assumptions of Proposition 1 
in Lepski (2013) follows from (4.6), (4.8), (4.11) and Bernstein's inequality. 
Applying the proposition with *!'(•) = | • |, e = y2 — 1 and y = 8r ln(ra) we 
obtain for any n > 3 and any r > 1 

(4.12) pjsup|CnWI > U n {K U x 2 )\ < 4n~ 4r . 

2°. Let us prove that 

(4.13) FP\ sup \vn,t{E)\ > C7i(n)[|JP||oo) < 4n" 4r . 
Considering a $2\times2$ matrix as an element of $\mathbb R^4$ we obviously have that the
set of matrices defined in (3.2) satisfies $\mathfrak E_{a,A}\subset[-B,B]^4$, with $B=(\sqrt{2a})^{-1}A$. We
apply the result from 1° with $\mathcal V=\mathbb R^2$, $Z=\mathfrak E_{a,A}$, $z=(e_{11},e_{12},e_{21},e_{22})$,
where $e_{ij}$ are the entries of the matrix $E$, $s=4$, and $G(\cdot,z)=J(\cdot,E)F(\cdot)$
that corresponds to $\{\zeta_n(z),z\in Z\}=\{\eta_{n,t}(E),E\in\mathfrak E_{a,A}\}$.
First, we note that

$$
|\det(E)|\ge a,\quad\forall E\in\mathfrak E_{a,A}.\tag{4.14}
$$

Indeed, $|\det(E)|\le2|E|_\infty^2\le2(\sqrt{2a})^{-2}|\det(E)|^2=a^{-1}|\det(E)|^2$, and
(4.14) follows.
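The implication behind (4.14) can be probed by simulation. The sketch below assumes the class $\mathfrak E_{a,A}$ is described by $|\det(E)|\le A$ and $|E|_\infty\le(2a)^{-1/2}|\det(E)|$ (our reading of (3.2)) and considers invertible matrices only; all names are illustrative.

```python
import math
import random


def det2(E):
    """Determinant of a 2x2 matrix given as nested lists."""
    return E[0][0] * E[1][1] - E[0][1] * E[1][0]


def sup_norm(E):
    """Largest absolute entry of the matrix."""
    return max(abs(e) for row in E for e in row)


def in_class(E, a, A):
    """Membership test for the assumed class, restricted to det(E) != 0."""
    d = abs(det2(E))
    return 0.0 < d <= A and sup_norm(E) <= d / math.sqrt(2.0 * a)


def count_violations(a=0.125, A=100.0, trials=5000, seed=1):
    """Return (members, violations); a violation is a member with |det| < a."""
    rng = random.Random(seed)
    members = violations = 0
    for _ in range(trials):
        E = [[rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)] for _ in range(2)]
        if in_class(E, a, A):
            members += 1
            if abs(det2(E)) < a * (1.0 - 1e-12):
                violations += 1
    return members, violations
```

No violations should be found, in line with the chain $|\det(E)|\le2|E|_\infty^2\le a^{-1}|\det(E)|^2$.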

Next, in the considered case, since supp(/C) C [—1/2, 1/2] 2 , after changing 
the variables we have 

(G* 2 ) 2 = sup / K 2 (u)F 2 (t + E- 1 u)g- 1 (t + E~ l u)du. 

EeS aiA J [-l/2,l/2] 2 

Note that l^ -1 ^ = | det^)]" 1 ^^ < (V2a) _1 for any E G £ aA and, 
therefore, 

sup F 2 (t + E- 1 u)g- 1 (t + E- 1 u) < sup F 2 (x)g- 1 (x) =: UJ^g), 
«e[-i/2,i/2] 2 xeA a 

where A a = [ - \ - -^L, \ + -j=\ 2 . Thus, 

(4.15) G^<||/C|| 2 ^ a , 2 (F, 5 ). 

By the same reasons denoting TZ&,oo(F,g) = su PxeA a l-^X^Ob -1 ^) we obtain 

(4.16) G^<VI||;C||;^a,«^<7)- 

Since F G F(/3o, M) assumption (4.6) is fulfilled as soon as inf xe A a g{x) < 00. 
It remains to check (4.7). By the use of the triangle inequality we get 

b oc (E,E') := sup I J{x,E)- J(x,E') \ \F(x)\ 
< 



det(E)\ - A/|det(£")| ||/C||^ ai00 (F,s) 
(4.17) + VI sup {\K(E(x-t)-K(E'(x-t))\\F(x)\g- 1 {x)\ 

First, we note that for any E, E' G £ a ,A 

K(E(x-t) = 0, K(E'(x-t) = 0, Mx + A a . 
Second, Assumption 4 yields $|K(y)-K(z)|\le2Q\|\mathcal K\|_\infty|y-z|_\infty$, $\forall y,z\in\mathbb R^2$,
that together with the previous display gives

VAsup\\K(E(x-t)-K(E'(x-t))\\F(x)\g- 1 {x)\ 

< 4\/lQ||/C|| 00 [l + (V2a)- 1 ]'R^ 0O (F,g)\E- E'^ 

(4.18) < UQWKWleiA/a^ooftg^E - E']^. 

Here we have also used that a < 1, A > 1 and ||/C||oo > 1- Next, using obvious 
for 2 x 2-matrices inequality |det(£) - det{E')\ < 4[|E| tX3 V |-E'|oo] l-E-.E'l^ 
we get in view of (4.14) 



(4.19) yJ\det(E)\ -y/\det(E')\ < y/2(A/a)\E - E 1 ^, VE,E'e£ ajA . 
We conclude in view of (4.17), (4.18) and (4.19) that (4.7) is fulfilled with 

(4.20) a = l, R = (A/a)K at00 (F,g)\\IC\\ 2 00 [l2Q + V2\. 



2+ui 



Let now a = 1/8, A = (/t m in) l = n(ln(n)) 

First we note that A a = [ — 5/2, 5/2] and, therefore, in view of Assump- 
tion 2 we have 

n^ 2 {F,g) < £-5 H^Hoo K ai00 {F,g) < 5 -1 ||^||cx,- 
It yields together with (4.15) and (4.16) 

(4.21) G& < n^H^y^g-HmoWFWoo =■ »2. 



Here we have also used that ||/C||2 < Halloo i n view of Assumption 4. Taking 
into account that h m i n > n~ l we obtain from (4.20) 

(4.22) ^R < 8ti^- 1 || j P|| 00 ||/C||^ [12Q + >/2], 

Thus, putting 

d(n) = 730 In (l6n 2 g~^ [12Q + y/2\\ + 8rln(n) + 394, 
we deduce from (4.21) and (4.22) that (4.12) holds with 

U n (*i,x*) = 2V2g_-^\\lC\\ 2 00 \\F\\ 00 v^) 

+(8/3)(ln(7i))-^- 1 ||/C||^ ||F|| 00 c 1 (n). 
It remains to note that $U_n(\varkappa_1,\varkappa_2)=\|F\|_\infty C_1(n)$ and (4.13) follows.
3°. Let us prove that 

(4.23) P^/ sup \Ut{E)\ > C 2 {n)\ < (4 + T)n" 4r ; 

Let v = (x, w) E V := M 3 and set for given r > 

h(v,E) = y/\ det(E)\K(E(x - t))g- l {x)wl { _ T , T] {w); 
I 2 (v,E) = y/\ det(E)\K(E(x - t))g^(x)w[l - 1[_ t , t] M . 

Putting Vi = (Xi,Si),i = 1, . . . , n, we remark that for any r > and any 

E e £ a ,A 

1 n 
e„, t (JB) = -='£{I 1 <y i ,]Z)+l2(V h ]2)} =■ &}(E)+£l(E). 

Noting that s ^j(-E') = > 5 {maxi=i,...,n |c» | < t} we have for any T > 
and any t > 



P xlf SUp \Cn,t( E )\> T ) 

' \Ee£„ A J 



( n ) / o„r. \cW(T?\\ >> T\ ^^T^-^r 



(4.24) < pw sup £#(£) >T +nTe" 

in view of Assumption 1. Thus, it is sufficient to apply the result obtained 
in the step 1° to the random field ^ t (E). It is eligible because the in- 
dependence of {£j}f =1 and {Xi}f =1 and symmetry of E\ provide that 
E xii£}( E )] =°- Moreover, we deduce from (4.15) and (4.16) 



G* 2 < a\\K\\in at2 (l,g), G*^ < rv / ^||/C||^ ^ a , 00 (l, 5 ), 

where 1 denotes the function identically equal to 1. Remind also that <r 2 
supp 6 fp J R x 2 p(x)dx. Note that for any v = (x,w) 

\h(v, E) - h {v, E')\<t\ J(x, E) - J(x, E') | 

and, therefore, we obtain from (4.20) that (4.7) is fulfilled with 

a = 1, R = r(A/a)fta ) oo(l,0)||£|& [12Q + y/2\. 
Let a = 1/8, A = /i~ in = n(ln(n)) and choose the truncation level 

r = (n -1 (4r + l)ln(?i)) 1/w . We have 

G* 2 <(a\fl)g-h\\]C\\l =:x 1 , 

(4.25) G£, < n^(^ 1 (4r + l))^(ln(n))"V 1 ||/C||L == «S- 

Here we have also used that ||/C|| 2 < ||/C||oo i n view of Assumption 4. Taking 
into account that h m m > n~ l we get 

(4.26) R < 8n(n-\4r + 1) ln(n)) » g 1 ||/C||£, [12Q + V2] , 
Note also that x\ A x 2 > g~2 ||/C || ^q since Q < 1. Thus, putting 

c 2 (n) = 730 In (l6n 2 (fT * (4r + 1) ln(n)) "5~5 [12Q + V2} ) + 8r ln(n) + 394, 
we deduce from (4.25) and (4.26) that (4.12) holds with 
£/■„(*!,*!,) = 2V2(aV l) £ -3 ll/Cll^v^M 

+ (8/3)c 2 (n)(ln(n))-V 1 ||/C||^ o (0- 1 (4r + l))- C2 (n). 



It remains to note that U n (>ci, x 2 ) = C2(n), our choice of r provides 
nexp{-^T w } = n" r , and (4.23) follows from (4.24) with T = C 2 (n). 
The assertion of the lemma follows now from (4.13) and (4.23). 



4.3. Proof of Lemma 3. We start the proof with the following statement. 
Set 

¥ 2 {^ ,M):=\W:R 2 ^R: ||W||oo + sup \ W ^~ W ^\ <m\. 
{ y , y 'eR2 ly-y'l^ J 

Then, for any / £ F(/3o, M) and any 0* 6 S 1 one has 

(4.27) FGF 2 (/3 ,M), 

for any $F$ satisfying (1.2). Indeed, since the Cauchy-Schwarz inequality
provides that $|(y-y')^\top\theta^*|\le|y-y'|_2$ for $\theta^*\in\mathbb S^1$, we have

I J-(yi) - ^(ift)| l/(y T ^)-/(y /T ^)l 

Halloo + SUp ■ -g- = ll/lloo + SUp ■ -gr 

y,y'eR 2 \y-y\2 y,y'^ 2 |y - y l 2 

< / 00 + SUp — : rg— - SUp : < M, 

i/i.l/aSR Iz/i-SfeP y ,y'eM 2 V |y — y I2 / 
as a consequence of Assumption 3. The inclusion (4.27) implies 

B 
(4.28) 



sup I K h (x-t)[F(x)-F(t)} 

te[-i/2,i/2] 2 Jm 2 

< pcii?Af&* < mi, 



d.r 



in view 



of (1.6). Here we have also used that K, is supported on [—1/2, 1/2]. 
Define T(x,t) = i)K h (x - t)g~ l (x), and put for any t G [-1/2, 1/2] 2 

4 = 1 
1 - 

With this notation the standard "approximation + stochastic part" decom- 
position of the estimator F(t) reads as follows: 

F(t) = [ K t) (x-t)F(x)dx+{n[ ) 2 yH'nn(t) + Ut))- 

That gives 

(4.29) IHFIU - IIFIUJ < \\Kg + (nf, 2 )^ (\\jjjn + ||Uoc) . 

We will prove that 

(4-30) F^jll^Hoo > C 3 (n)||*1|oo} < An 



- -lr 



(4.31) 



g{lkJ|oo>C 4 (n)}<(4 + T)n- 4r , 



where $C_3(n)$ and $C_4(n)$ are given in Section 3.1.

Let us show how to deduce the assertion of the lemma from (4.30) and 

(4.31). Indeed, remembering that C§(n) = ||/C||f + (nt) 2 ) ^Ci(n) and that 

63(72) (nf) 2 ) 5 < 1/2 for any n > n\ in view of restriction (2.6), we obtain 
from (4.29) 

{Halloo < CsHUFIU} n {]|e„lloo < C 4 (n)} 
C {lll^Hoo - II^IUl < C 5 {n) + 2^ 1 ||F|| 00 } 



{ 



FooG 



| J F|| 00 ,3M + 4C; 



(«)]}■ 
It yields for any $F \in \mathbb{F}_2(\beta_0, M)$



i n) |^oc i [||F|| 00 ,3M + 4C 5 (n)]| 



< ^{ll^nlloo > CaCnJII^Hoo} +Pg{lia°c > C 4 (n)}, 

and the assertion of the lemma follows from (4.30) and (4.31) since the right-hand side of the inequality is independent of $\theta^*$ and $f$.

Thus, let us proceed with the proof of (4.30). It is based on the application 
of inequality (4.12) established in the proof of Lemma 2. 

1°. We apply (4.12) with $\zeta_n(z) = \eta_n(t)$, $z = t$, $V = \mathbb{R}^2$, $s = 2$, $B = 1$, $Z = [-1/2, 1/2]^2$ and $G(\cdot, z) = T(\cdot, t)F(\cdot)$.

Since $\mathcal{K}$ is supported on $[-1/2, 1/2]^2$ and $\mathfrak{h} < 1$ for any $n > n_0$, we obtain

(4.32) $T(x,t) = 0, \quad \forall x \notin [-1,1]^2, \ \forall t \in [-1/2, 1/2]^2.$
It yields together with Assumption 1 and (4.27) that 

(4.33) G* 2 < HFIU^II/CIIL =: x lf G^ < f." 1 Il^lloo^" 1 ||^||L == **• 

We see that (4.6) is fulfilled. Now let us check (4.7). In view of (4.32) and the triangle inequality, we have for any $t, t' \in [-1/2, 1/2]^2$

6 0O (t,t / ) := sup \T{x,t)-T(x,t')\\F{x)\ 

XG[-1,1] 2 

< bWFWoog- 1 sup \K h (x - t) - K^x - t')\ 

X'6[-l,l] 2 

(4.34) < 2r 2 ||i ? ||ooQ^ 1 ||/C|| 00 |t-t / | oo . 

The latter inequality follows from Assumption 4. Thus, we obtain that (4.7) 
holds with 

(4.35) a = l, # = 2ra 2 Qg~ 1 ||/C|| 2 J|F|| 00 . 
Here we have used that $\mathfrak{h} > n^{-1}$ and $\|\mathcal{K}\|_\infty \ge 1$. Thus, putting

c 3 (n) = 365 In (2n 2 Qg~^ +8rln(n) + 197, 
we deduce from (4.33) and (4.35) that (4.12) holds with 

U n (x 1 ,x 2 ) = C 3 (n)||F||oo, 
where, we recall, C 3 (n) = 2V2^5||/C||^ v / ^)+(8/3)£- 1 ||/C||^c 3 (n)(nf) 2 )"i
Thus, (4.30) is established. 

2°. To prove the inequality (4.31) we first note that, similarly to (4.24), we have for any $y > 0$ and any $\tau > 0$



{ xl( sup \Ut)\>y) 

Vte[-i/2,i/2l 2 / 



.te[-i/2,i/2] 2 

(4.36) < P<? £ ( sup | ei 1} (t) \>y)+ nTe-^ 

' \te[-i/2,i/2] 2 / 

in view of Assumption 1. Here we have denoted 

1 n 

H\t) = ^r(x l ,t) e a [ - T , r ]fe). 

We will apply (4.12) with $\zeta_n(z) = \xi_n^{(\tau)}(t)$, $z = t$, $Z = [-1/2, 1/2]^2$, $s = 2$, $B = 1$ and $G(v, z) = T(x,t)\, w\, \mathbf{1}_{[-\tau,\tau]}(w)$, $v = (x, w) \in V := \mathbb{R}^3$.
We have in view of Assumption 1 and (4.32)

(4.37) G* 2 <{oV %^11/CHL =: xi, G^ < rrV^II^ == ^ 
and, therefore, (4.6) holds. 



Using the last inequality in (4.34) we obtain for any t,t' £ [—1/2, 1/2] 
boo(t,t') := sup \T(x,t) -T(x,t')\\w\l[_ T:T ](w) 



2 



< 2rr 2 Q5" 1 ||/C|| 00 |t-t / | oo . 
Hence, (4.7) is fulfilled with 
(4.38) a = l, R = 2Tn 2 Qg- l \\tC\\l . 

Choose r = (fl _1 (4r + 1) ln(n)) u and note that x\ A >q > g~? \\fc\\^o since 
VI < 1. Thus, putting 

c 4 {n) = 365 In f2n 2 (0~ 1 (4r + l)]a(n))"g~^Q\ + 8rln(n) + 197, 

we deduce from (4.37) and (4.38) that (4.12) holds with 

U n (x 1 ,x 2 ) = 2yft{o V l)£-*||/C|&< s /c4(n) + (8/3)-rc 4 (n)(nf| 2 ) _ ^£- 1 ||/C||^ . 
Since $U_n(\varkappa_1, \varkappa_2) = C_4(n)$, choosing $y = C_4(n)$ in (4.36), we come to (4.31).
4.4. Proof of Lemma 5. First, in view of the $(p,p)$-strong maximal inequality, see e.g. Theorem 9.16 in Wheeden and Zygmund (1977), one has

||A^(/v)!I p <7pI|Ak;, 9 (/v)II p , 

where the constant t p depends only of p . 

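For any $\delta \in (0, h]$ put $B(z,\delta) = \delta^{-1}\int \mathcal{K}([u-z]/\delta)\,(g(u) - g(z))\,du$ and define

$$\Delta^{(n)}_{\mathcal{K},g}(h,z) = \sup_{\delta \in [hn^{-1},\,h]} B(z,\delta), \qquad n = 1, 2, \ldots .$$

We remark that the sequence $\{\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\}_{n\ge 1}$ increases monotonically and $\Delta^{(n)}_{\mathcal{K},g}(h,z) \to \Delta_{\mathcal{K},g}(h,z)$ for any $z \in \mathbb{R}$, as $n \to \infty$. Hence, by Beppo Levi's theorem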


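$$\|\Delta_{\mathcal{K},g}(h,\cdot)\|_p = \lim_{n\to\infty} \|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p$$

and, in view of (4.4), to complete the argument we need to show that

(4.39) $$\sup_{g \in \mathbb{N}_p(s,Q)} \|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p \le 2Qh^s\|\mathcal{K}\|_\infty\,[2^{sp} - 1]^{-1/p}, \qquad \forall n \ge 1.$$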



Assumption 4 (2) implies that $B(z,\cdot)$ is continuous on $[n^{-1}h, h]$. Hence for any $z \in \mathbb{R}$ there exists $\delta(z) \in [n^{-1}h, h]$ such that

(4.40) $$\Delta^{(n)}_{\mathcal{K},g}(h,z) = B(z,\delta(z)).$$

For any $l = 0, \ldots, \log_2 n - 1$ (without loss of generality $\log_2 n$ is assumed to be an integer) we consider the slices $V_l = \{z \in \mathbb{R} : a_{l+1} < \delta(z) \le a_l\}$ with $a_l = 2^{-l}h$. In what follows, the integral over the empty set is taken to be zero.
Then

(4.41) $$\|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p^p = \sum_{l=0}^{\log_2 n - 1} \int_{V_l} |B(z,\delta(z))|^p\,dz.$$
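The slicing step can be made concrete with a small sketch. It assumes the half-open convention $a_{l+1} < \delta \le a_l$ with $a_l = 2^{-l}h$; the helper name `slice_index` is hypothetical and not part of the paper. It illustrates that on the $l$-th slice one has $1/\delta < 2/a_l$ and $\delta \le a_l$, which is what the slice-wise bounds rely on.

```python
def slice_index(delta, h):
    """Return l >= 0 such that a_{l+1} < delta <= a_l, where a_l = 2**(-l) * h."""
    if not 0.0 < delta <= h:
        raise ValueError("delta must lie in (0, h]")
    l = 0
    # Move to a finer slice while delta still fits below the next dyadic level.
    while delta <= h / 2 ** (l + 1):
        l += 1
    return l


h, n = 1.0, 16  # illustrative values; log2(n) = 4
# A grid of bandwidths in (h/n, h], clamped at h to guard against rounding.
deltas = [min(h, h / n + k * (h - h / n) / 200) for k in range(1, 201)]
indices = [slice_index(d, h) for d in deltas]

for d, l in zip(deltas, indices):
    a_l = h / 2 ** l
    # On slice l: the bandwidth is at most a_l and 1/delta is below 2/a_l.
    assert d <= a_l and 1.0 / d < 2.0 / a_l

print(min(indices), max(indices))
```

With these values every bandwidth in $(h/n, h]$ falls into one of the slices $l = 0, \ldots, \log_2 n - 1$.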



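We will treat the cases $s \le 1$ and $s > 1$ separately. If $s \le 1$, on any slice $V_l$, $l = 0, \ldots, \log_2 n - 1$, we have

(4.42) $$|B(z,\delta(z))| \le \frac{\|\mathcal{K}\|_\infty}{\delta(z)} \int_{-\delta(z)/2}^{\delta(z)/2} |g(z+v) - g(z)|\,dv \le \frac{2\|\mathcal{K}\|_\infty}{a_l} \int_{-a_l/2}^{a_l/2} |g(z+v) - g(z)|\,dv = 2\|\mathcal{K}\|_\infty \int_{-1/2}^{1/2} |g(z + ta_l) - g(z)|\,dt.$$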
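We obtain from (4.41) and (4.42), using Minkowski's inequality for integrals and writing for ease of notation $\mu = 2\|\mathcal{K}\|_\infty$, that

$$\|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p^p \le \mu^p \sum_{l=0}^{\log_2 n - 1} \int_{V_l} \Big( \int_{-1/2}^{1/2} |g(ta_l + z) - g(z)|\,dt \Big)^p dz$$

$$\le \mu^p \sum_{l=0}^{\log_2 n - 1} \Big( \int_{-1/2}^{1/2} \|g(\cdot + ta_l) - g(\cdot)\|_p\,dt \Big)^p$$

$$\le \Big[ \frac{Qh^s\|\mathcal{K}\|_\infty 2^{1-s}}{s+1} \Big]^p \sum_{l=0}^{\infty} 2^{-lsp}.$$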



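Here we have used that $g \in \mathbb{N}_p(s,Q)$. Thus, we have for any $s \le 1$ and any $n \ge 1$

(4.43) $$\sup_{g \in \mathbb{N}_p(s,Q)} \|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p \le 2Qh^s\|\mathcal{K}\|_\infty\,[2^{sp} - 1]^{-1/p}.$$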
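For completeness, here is how the constants combine in the case $s \le 1$. This is a sketch under the assumptions that membership in $\mathbb{N}_p(s,Q)$ gives $\|g(\cdot+u) - g(\cdot)\|_p \le Q|u|^s$ and that $a_l = 2^{-l}h$; it only makes the arithmetic behind the final bound explicit.

```latex
\begin{align*}
\int_{-1/2}^{1/2} |t|^{s}\,dt &= \frac{2^{-s}}{s+1},
\qquad
\sum_{l=0}^{\infty} 2^{-lsp} = \frac{2^{sp}}{2^{sp}-1},\\
\frac{2^{1-s}\,Q h^{s}\|\mathcal{K}\|_\infty}{s+1}
\Big(\frac{2^{sp}}{2^{sp}-1}\Big)^{1/p}
&= \frac{2\,Q h^{s}\|\mathcal{K}\|_\infty}{s+1}\,\big[2^{sp}-1\big]^{-1/p}
\le 2\,Q h^{s}\|\mathcal{K}\|_\infty\,\big[2^{sp}-1\big]^{-1/p}.
\end{align*}
```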



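If $s > 1$, using Taylor's formula we have for any $g \in \mathbb{N}_p(s,Q)$ and any $v \in \mathbb{R}$

$$g(z+v) - g(z) = \sum_{m=1}^{m_s} \frac{g^{(m)}(z)}{m!}\,v^m + \frac{v^{m_s}}{(m_s-1)!} \int_0^1 (1-\lambda)^{m_s-1} \big[ g^{(m_s)}(z + v\lambda) - g^{(m_s)}(z) \big]\,d\lambda.$$

We have in view of Assumptions 4 and 5 for any $z \in \mathbb{R}$

$$|B(z,\delta(z))| \le \frac{\|\mathcal{K}\|_\infty}{(m_s-1)!}\,\frac{1}{\delta(z)} \int_{-\delta(z)/2}^{\delta(z)/2} \int_0^1 |v|^{m_s}(1-\lambda)^{m_s-1} \big| g^{(m_s)}(z + \lambda v) - g^{(m_s)}(z) \big|\,d\lambda\,dv.$$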
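The Taylor identity with integral remainder used here, in which the top-order derivative at $z$ is subtracted inside the remainder, can be checked numerically. The sketch below is an illustration only: the test function (sine), the order `m = 2`, the evaluation points and the helper names `simpson` and `taylor_rhs` are all assumptions chosen for the example.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3


def taylor_rhs(derivs, m, z, v):
    """Right-hand side of the Taylor identity of order m.

    derivs[k] is the k-th derivative of g (derivs[0] is g itself).
    """
    # Polynomial part: sum over k = 1..m of g^(k)(z) v^k / k!
    poly = sum(derivs[k](z) * v ** k / math.factorial(k) for k in range(1, m + 1))
    top = derivs[m]
    # Remainder integrand: (1 - lam)^(m-1) * [g^(m)(z + v*lam) - g^(m)(z)]
    integrand = lambda lam: (1 - lam) ** (m - 1) * (top(z + v * lam) - top(z))
    return poly + v ** m / math.factorial(m - 1) * simpson(integrand, 0.0, 1.0)


derivs = [math.sin, math.cos, lambda x: -math.sin(x)]  # g, g', g'' for g = sin
z, v, m = 0.3, 0.4, 2
lhs = math.sin(z + v) - math.sin(z)
rhs = taylor_rhs(derivs, m, z, v)
print(abs(lhs - rhs))
```

The two sides agree up to quadrature error, which confirms that only the increment of the top-order derivative enters the remainder.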



By the latter inequality, for any $z \in V_l$ we get

(4.44) $$|B(z,\delta(z))| \le \frac{2\|\mathcal{K}\|_\infty\, a_l^{m_s}}{(m_s-1)!} \int_{-1/2}^{1/2} \int_0^1 |t|^{m_s}(1-\lambda)^{m_s-1} \big| g^{(m_s)}(z + \lambda t a_l) - g^{(m_s)}(z) \big|\,d\lambda\,dt.$$



Thus, we obtain from (4.40), (4.41) and (4.44), using Minkowski's inequality for integrals and denoting $\mu = 2\|\mathcal{K}\|_\infty/(m_s-1)!$, that

$$\|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p^p = \sum_{l=0}^{\log_2 n - 1} \int_{V_l} |B(z,\delta(z))|^p\,dz$$

$$\le \mu^p \sum_{l=0}^{\log_2 n - 1} a_l^{m_s p} \int_{V_l} \Big( \int_{-1/2}^{1/2} \int_0^1 |t|^{m_s}(1-\lambda)^{m_s-1} \big| g^{(m_s)}(z + \lambda t a_l) - g^{(m_s)}(z) \big|\,d\lambda\,dt \Big)^p dz$$

$$\le \mu^p \sum_{l=0}^{\log_2 n - 1} a_l^{m_s p} \Big( \int_{-1/2}^{1/2} \int_0^1 |t|^{m_s}(1-\lambda)^{m_s-1} \big\| g^{(m_s)}(\cdot + \lambda t a_l) - g^{(m_s)}(\cdot) \big\|_p\,d\lambda\,dt \Big)^p$$

$$\le \Big[ \frac{Qh^s\|\mathcal{K}\|_\infty 2^{1-s}}{(s+1)(m_s+1)(m_s-1)!} \Big]^p \sum_{l=0}^{\infty} 2^{-lsp}.$$
Here we have used that $g \in \mathbb{N}_p(s,Q)$. Thus, we have for any $s > 1$ and $n \ge 1$

(4.45) $$\sup_{g \in \mathbb{N}_p(s,Q)} \|\Delta^{(n)}_{\mathcal{K},g}(h,\cdot)\|_p \le 2Qh^s\|\mathcal{K}\|_\infty\,[2^{sp} - 1]^{-1/p}.$$



We conclude that (4.39) is established in view of (4.43) and (4.45).